LINEAR 


S. Kumaresan 


Contents

Preface
Note to the Students
Notation
1. Systems of Linear Equations
   1.1 A Motivating Example
   1.2 Systems of Linear Equations
2. Vector Spaces
   2.1 Definition and Examples
   2.2 Vector Subspaces
   2.3 Basis and Dimension of a Vector Space
3. Lines and Quotient Spaces
   3.1 Definition of a Line
   3.2 Affine Spaces
   3.3 Quotient Space
4. Linear Transformations
   4.1 Linear Transformations
   4.2 Representation of Linear Maps by Matrices
   4.3 Kernel and Image of a Linear Transformation
   4.4 Linear Isomorphism
   4.5 Geometric Ideas and Some Loose Ends
   4.6 Some Special Linear Transformations
5. Inner Product Spaces
   5.1 Inner Product Spaces
       5.1.1 The Euclidean Plane and the Dot Product
       5.1.2 General Inner Product Spaces



   5.2 Orthogonality
   5.3 Some Geometric Applications
   5.4 Orthogonal Projection onto a Line
   5.5 Orthonormal Basis
   5.6 Orthogonal Complements and Projections
   5.7 Linear Functionals and Hyperplanes
       5.7.1 Hyperplanes
   5.8 Orthogonal Transformations
       5.8.1 Coordinates Associated with an Orthonormal Basis
   5.9 Reflections and Orthogonal Maps of the Plane
       5.9.1 Reflections
       5.9.2 Orthogonal Maps of the Plane
6. Determinants
   6.1 2 x 2 Determinant as Area of a Parallelogram
   6.2 Determinant and its Properties
   6.3 Computation of Determinants
   6.4 Basic Results on Determinants
       6.4.1 Laplace Expansion
       6.4.2 Cramer's Rule
       6.4.3 Some Geometric Ideas
   6.5 Orientation and Vector Product
       6.5.1 Orientation
       6.5.2 Vector Product
7. Diagonalization
   7.1 Rotation of Axes of Conics
   7.2 Eigenvalues and Eigenvectors
       7.2.1 Cayley-Hamilton Theorem
   7.3 Diagonalization of Symmetric Matrices
8. Classification of Quadrics
   8.1 Conics and Quadrics
       8.1.1 Classification of Quadrics
   8.2 Computational Examples


9. Review Problems
   9.1 Linear Equations
   9.2 Linear Dependence
   9.3 Basis and Dimension
   9.4 Linear Transformations
   9.5 Euclidean Spaces
   9.6 Problems in Linear Geometry
   9.7 Miscellaneous Problems
Index




Note to the Students


Lastly, do not assume that I am always right. Have a healthy
scepticism and check out the statements. In case you find any
typographical or mathematical mistakes, make sure that you are
right. This will give you immense confidence. (I enjoyed this aspect of
reading a book when I was a student.) I also request you to send me
your list of my mistakes and other suggestions for improvement in
future editions.




Notation 


We use the following notation.
$\mathbb{N}$ stands for the set of positive integers.
$\mathbb{Z}$ stands for the set of all integers.
$\mathbb{Q}$ stands for the set of all rational numbers.
$\mathbb{R}$ stands for the set of real numbers.
$\mathbb{C}$ stands for the set of all complex numbers.
If $X$ is any set, then $X^n$ stands for the cartesian product of $X$ with
itself $n$ times:

\[ X^n = \underbrace{X \times \cdots \times X}_{n \text{ times}}. \]

The notation $:=$ signifies that the left side object is defined by the
right side. For example,

\[ \mathbb{R}_+ := \{x \in \mathbb{R} : x > 0\} \]

defines the set of positive real numbers. If we write $X := \mathbb{R}^n$, then it
means that the set $X$ is taken to be the set $\mathbb{R}^n$.




1. Systems of Linear Equations


1.1 A Motivating Example 


Before beginning the theory of vector spaces, let us start with an example 
which will serve as a motivation as well as a precursor to what is to follow. 

A word of caution: Do not be overly concerned with where $m$, $n$ and $x$
are from. If you wish you may assume that they are rational numbers.

Suppose in my neighbourhood there is an eccentric shopkeeper. He is 
convinced that north Indians eat more wheat than rice and south Indians 
eat more rice than wheat. So he offers only two standard packets. The first 
packet, call it N, has 5 kilograms of wheat and 2 kilograms of rice, whereas
the second packet, call it S, has 2 kilograms of wheat and 5 kilograms of 
rice. Let us invent a shorthand. Whenever we write (m,n) we mean m kg 
of wheat and n kg of rice. Now if I buy 3 packets of N, it means that I 
am buying 15 kg of wheat and 6 kg of rice, that is, 3N = 3(5, 2) = (15,6). 
Similarly, 2 packets of S means 4 kg of wheat and 10 kg of rice, that is, 
2S = 2(2,5) = (4,10). 

If I buy one of each of the packets, then I would have bought 7 kg of 
wheat and 7 kg of rice. That is, 


N +S = (5,2) + (2,5) =(5+2,2+5) = (7,7). 


Thus if I need $m$ packets of $N$ or $n$ of $S$ or both, there is no problem.
Suppose I need 19 kg of wheat and 16 kg of rice. What do I do? I need to
buy $x$ packets of $N$ and $y$ packets of $S$ so that $x(5,2) + y(2,5) = (19,16)$.
That is, $(5x, 2x) + (2y, 5y) = (19,16)$ or $(5x + 2y,\ 2x + 5y) = (19,16)$. Thus
I end up solving a system of linear equations

\[
\begin{aligned}
5x + 2y &= 19 \\
2x + 5y &= 16.
\end{aligned}
\]




If I solve this system using the methods we learnt in high school, I see that I
need to buy 3 packets of $N$ and 2 packets of $S$.

Suppose I need 34 kg of wheat and 1 kg of rice. Then I find I must buy 8
packets of $N$ and $-3$ packets of $S$. What does this mean? I buy 8 packets
of $N$ and from these I make three packets of $S$ and give them back to the
shopkeeper. How nice of him to accept these!

A little more thinking would convince you that if you want to buy $m$ kg
of wheat and $n$ kg of rice, you can always find $x$ and $y$ such that buying
$x$ packets of $N$ and $y$ packets of $S$ does the job.
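The two purchases worked out above can be checked mechanically. The following is a minimal sketch in Python, assuming exact rational arithmetic; the function name `solve_packets` is ours and is not part of the text. It eliminates one unknown at a time from $5x + 2y = m$, $2x + 5y = n$.

```python
from fractions import Fraction

def solve_packets(wheat, rice):
    """Solve x*(5, 2) + y*(2, 5) = (wheat, rice) for the packet counts x and y."""
    # The system is 5x + 2y = wheat, 2x + 5y = rice.
    # 5*(first) - 2*(second) eliminates y; 5*(second) - 2*(first) eliminates x.
    det = Fraction(5 * 5 - 2 * 2)          # 21, never zero
    x = (5 * wheat - 2 * rice) / det
    y = (5 * rice - 2 * wheat) / det
    return x, y

print(solve_packets(19, 16))   # 3 packets of N and 2 packets of S
print(solve_packets(34, 1))    # 8 packets of N and -3 packets of S
```

Since the denominator 21 is never zero, the sketch returns some rational pair for every demand $(m, n)$, which is the content of the remark above.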

Of course if the shopkeeper is simple minded, he would be selling a
packet $e_1$ containing 1 kg of wheat and 0 kg of rice and another packet $e_2$
containing 0 kg of wheat and 1 kg of rice, thereby making our life easier!

One instructive exercise in the same vein. 


Exercise 1.1.1 Let us assume that the shopkeeper sells packets $P_1$ of 1 kg
of wheat and 1 kg of rice and 1 kg of turdal, and $P_2$ containing 1 kg of
wheat, 0 kg of rice and 1 kg of turdal and $P_3$ comprising 0 kg of wheat,
1 kg of rice and 1 kg of turdal. Is it possible for me to buy only one kilogram
of turdal?


We also note the following. In the first example of $(5,2)$ and $(2,5)$, if I
want to buy $(m,n)$, there is exactly one way of doing this: Buy $x$ packets
of $N$ and $y$ packets of $S$ where $x$ and $y$ are rational numbers.

On the contrary, consider the case when there is another packet
$P = (1,1)$. One then easily checks that there are many ways: For example,
$(7,7) = 1N + 1S = 7P$.

One last remark. Let the packet $P_1 = (5,2)$ be priced at Rs. $R_1$ and $P_2$
at Rs. $R_2$. Then if $f$ is the price of $m$ packets of $P_1$ and $n$ packets of $P_2$,
we see that $f(mP_1 + nP_2) = m f(P_1) + n f(P_2) = mR_1 + nR_2$.


1.2 Systems of Linear Equations 


The recurring themes in Linear Algebra are: 


(1) solutions of linear equations, and 


(2) their geometric interpretation. 




say that the point $(x, y) \in \mathbb{R}^2$ satisfies the equation or is a solution of the
equation. The geometric interpretation of the equation is that the set of
all points satisfying the equation forms a straight line in the plane through
the point $(0, c/b)$ and with slope $-a/b$.

The collection of linear equations

\[
\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + \cdots + a_{2n}x_n &= b_2 \\
&\ \,\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

is called a system of $m$ linear equations in $n$ unknowns $x_1, \ldots, x_n$. Here
$a_{ij}, b_i \in \mathbb{R}$ are given. We shall write this in a short form as

\[ \sum_{j=1}^{n} a_{ij}x_j = b_i, \qquad 1 \le i \le m. \tag{1.2.1} \]

To solve this system is to find real numbers $x_1, \ldots, x_n$ which satisfy the
system. Any $n$-tuple $x := (x_1, \ldots, x_n)$ which satisfies the system is called
a solution of the system.

For example, consider the case when $m = 2$ and $n = 3$. Then,

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2.
\end{aligned}
\]

Each of the above equations determines a plane, assuming that at least one
$a_{1j} \neq 0$ and at least one $a_{2j} \neq 0$. Hence, the set of solutions of the system
is the set of all points which lie on both the planes, that is, the set of
solutions is the intersection of the planes.

Consider the system

\[
\begin{aligned}
2x - 3y &= 0 \\
-8x + 12y &= 1.
\end{aligned}
\]

We see that this system has no solution. Geometrically, each equation is a
line and these are two distinct but parallel lines. Hence, their intersection
is empty.

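If each $b_i$ in the above system (1.2.1) is zero, then the system is said to
be homogeneous. The homogeneous system in $n$ variables (unknowns) is

\[
\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + \cdots + a_{2n}x_n &= 0 \\
&\ \,\vdots \\
a_{m1}x_1 + \cdots + a_{mn}x_n &= 0.
\end{aligned}
\]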




Again, we adopt a shorthand notation

\[ \sum_{j=1}^{n} a_{ij}x_j = 0, \qquad 1 \le i \le m. \tag{1.2.2} \]

Note that $0 = (0, \ldots, 0)$ is always a solution of the homogeneous system.
This solution is called the trivial solution. We say $x = (x_1, \ldots, x_n)$ is a
non-trivial solution if $(x_1, \ldots, x_n) \neq (0, \ldots, 0)$. That is, if there exists at
least one $i$ such that $x_i \neq 0$.

A non-trivial solution need not always exist. For example, consider the
system of two linear equations in two unknowns: $x = 0$ and $y = 0$.

Let $(a_1, \ldots, a_n)$ be a solution of the homogeneous system (1.2.2). Then
we see that $(\alpha a_1, \ldots, \alpha a_n)$ is again a solution of (1.2.2) for any $\alpha \in \mathbb{R}$.
This has the following geometric interpretation in the case of the three
dimensional space $\mathbb{R}^3$.

Let $(r, s, t)$ be a solution of the system $ax + by + cz = 0$. That is,
$ar + bs + ct = 0$. Then the solution set is a plane through the origin. Now
the plane contains the two points $(0,0,0)$ and $(r,s,t)$ and hence all the
points on the line joining them. But any point on the line is of the form
$\alpha(r,s,t)$ for some $\alpha \in \mathbb{R}$. For, recall that the equation of the line joining
these two points is given by

\[ \frac{x - 0}{0 - r} = \frac{y - 0}{0 - s} = \frac{z - 0}{0 - t} = -\alpha, \]

say, a constant $-\alpha$. It is then immediate that $x = \alpha r$, $y = \alpha s$ and $z = \alpha t$.
Also if $(b_1, \ldots, b_n)$ is another solution of (1.2.2), then

\[ (a_1 + b_1, \ldots, a_n + b_n) \]

is again a solution of (1.2.2). We describe this by saying that the set of
solutions of a homogeneous system of linear equations is closed under
addition and scalar multiplication.

This suggests the following definition of "addition" and "scalar
multiplication" on $\mathbb{R}^n$, the set of $n$-tuples of real numbers. If
$x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ and $\alpha \in \mathbb{R}$, define
$x + y := (x_1 + y_1, \ldots, x_n + y_n)$ and $\alpha x := (\alpha x_1, \ldots, \alpha x_n)$.

However, the set of solutions of a non-homogeneous system of linear
equations need not be closed under addition and scalar multiplication. For
example, consider the equation $3x - 2y = 1$. $a = (1,1)$ is a solution but
$\alpha(1,1) = (\alpha, \alpha)$ is not a solution if $\alpha \neq 1$. Also, $b = (2, 2\tfrac{1}{2})$ is another
solution but $(1,1) + (2, 2\tfrac{1}{2}) = (3, 3\tfrac{1}{2})$ is not a solution. Note that $(0,0)$ is
not a solution.
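The contrast between the two cases can be tested numerically. The sketch below is only an illustration: the helper names `satisfies`, `add` and `scale` are ours, and the sample solutions $(2,3)$ and $(4,6)$ of the homogeneous equation $3x - 2y = 0$ are chosen by us, not taken from the text.

```python
from fractions import Fraction

def satisfies(coeffs, rhs, point):
    """True if coeffs[0]*point[0] + coeffs[1]*point[1] + ... equals rhs."""
    return sum(c * p for c, p in zip(coeffs, point)) == rhs

def add(u, v):
    """Componentwise addition of two tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, u):
    """Scalar multiple alpha * u of a tuple."""
    return tuple(alpha * a for a in u)

coeffs = (3, -2)
a = (1, 1)
b = (2, Fraction(5, 2))

# Non-homogeneous equation 3x - 2y = 1: a and b are solutions ...
print(satisfies(coeffs, 1, a), satisfies(coeffs, 1, b))
# ... but neither a + b nor 2a is.
print(satisfies(coeffs, 1, add(a, b)), satisfies(coeffs, 1, scale(2, a)))

# Homogeneous equation 3x - 2y = 0 (sample solutions are our own choice):
u, v = (2, 3), (4, 6)
print(satisfies(coeffs, 0, add(u, v)), satisfies(coeffs, 0, scale(-5, u)))
```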

The homogeneous system given by $\sum_{j=1}^{n} a_{ij}x_j = 0$, $1 \le i \le m$, is called
the associated homogeneous system of (1.2.1).




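Let $S$ be the set of solutions of the non-homogeneous system and $S_h$ be the
set of solutions of the associated homogeneous system of equations. Assume
$S \neq \emptyset$. $S_h$ is always non-empty, as the trivial solution $(0, \ldots, 0) \in S_h$.

Let $x \in S$ and $y \in S_h$. We claim that for any $\alpha \in \mathbb{R}$, $x + \alpha y \in S$. Since
$x \in S$ we have $\sum_{j=1}^{n} a_{ij}x_j = b_i$. Similarly, $\sum_{j=1}^{n} a_{ij}y_j = 0$ for $1 \le i \le m$.
For $\alpha \in \mathbb{R}$ and $1 \le i \le m$, we have

\[
\begin{aligned}
\sum_{j=1}^{n} a_{ij}(x_j + \alpha y_j)
  &= \sum_{j=1}^{n} a_{ij}x_j + \alpha \sum_{j=1}^{n} a_{ij}y_j \\
  &= \sum_{j=1}^{n} a_{ij}x_j \\
  &= b_i \quad \text{for } 1 \le i \le m.
\end{aligned}
\]

This proves our claim.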
Let us try to understand the geometry underlying this observation. Let
us look at a simple equation

\[ x + y = 2. \tag{1.2.3} \]

This obviously represents a line in the plane $\mathbb{R}^2$ passing through the point
$(1,1)$ (see Figure 1.2.1). That is, the solution set $S$ is a line passing through
$(1,1)$. The solution set $S_h$ of the associated system $x + y = 0$ is the
line passing through the origin whose "slope" is $-1$. Note that the line
$S_h$ is parallel to the line $S$ so that to get the solution set $S$ we need to
take the line parallel to the solution set $S_h$ of the associated homogeneous
system and passing through any point in $S$ (for example, in this case, $(1,1)$).
Similar geometric considerations apply to a more general equation of the
type $ax + by = c$.


Figure 1.2.1 Solution of a linear system.




Exercise 1.2.1 Extend the geometric interpretation to 


(1) an equation of the type $ax + by + cz = d$, and


(2) a system of two equations in three variables. 


Let $x = (x_1, \ldots, x_n) \in S$ and $z = (z_1, \ldots, z_n) \in S$. Then we have
$\sum_{j=1}^{n} a_{ij}x_j = b_i$ and $\sum_{j=1}^{n} a_{ij}z_j = b_i$. Therefore,

\[ \sum_{j=1}^{n} a_{ij}(x_j - z_j) = \sum_{j=1}^{n} a_{ij}x_j - \sum_{j=1}^{n} a_{ij}z_j = b_i - b_i = 0. \]

That is, if $x$ and $z$ are any two solutions of the non-homogeneous system
then $x - z$ is a solution of the homogeneous system. That is, $x - z \in S_h$.

Do you see the geometric meaning underlying this? For instance, in the
case of Equation (1.2.3), it says that if we take any two points $p = (x, y)$
and $q = (x', y')$ on the line $S$, their "difference" $p - q := (x - x', y - y')$ lies
on the parallel line through the origin.


Exercise 1.2.2 What is the analogue of the geometric meaning of the cases 
in Exercise 1.2.1? 


We put the above observations into a single fact: Fix $x \in S$. Then if we
define $x + S_h := \{x + y \mid y \in S_h\}$, what we saw above is that $x + S_h \subseteq S$.
Also, for all $z \in S$, $z = x + (z - x) \in x + S_h$. This implies $S \subseteq x + S_h$.
Therefore, $S = x + S_h$. This $x$ is called a particular solution of (1.2.1). So
we have the following fact:


To find all the solutions of (1.2.1) it is enough to 
find all the solutions of the associated homogeneous 


system and any one particular solution of (1.2.1). 


We look at some examples. These are mainly for the purpose of reviewing
the so-called Gaussian elimination method of solving linear equations. The
idea is to eliminate the first variable, say $x_1$, and reduce the system to
another in which, except for the first equation, the rest are linear equations
in fewer variables. We repeat the above process with the system
so obtained by deleting again the first equation till we are finally left with a
single equation. In this last equation, except for the first $x_i$ term, the rest
of the variables are treated as "free" and assigned arbitrary real numbers.


The best way is to go through some of the examples below and work some 
more on your own. 
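For readers who like to experiment, the elimination procedure can be sketched in Python. This is only a sketch under our own assumptions: it uses the full (Gauss-Jordan) variant, which clears each pivot column in every other row rather than back-substituting by hand, it works with exact fractions, and the name `solve_linear_system` is ours. It returns a particular solution together with solutions of the associated homogeneous system, one for each free variable, in the spirit of the fact $S = x + S_h$ stated above.

```python
from fractions import Fraction

def solve_linear_system(A, b):
    """Return (particular, basis) describing all solutions of A x = b,
    or None if the system has no solution.

    Every solution is particular + a combination of the vectors in basis.
    """
    m, n = len(A), len(A[0])
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    pivots = []                      # pivot column of each pivot row, in order
    row = 0
    for col in range(n):
        # Look for an equation, not yet used, in which this variable occurs.
        p = next((r for r in range(row, m) if M[r][col] != 0), None)
        if p is None:
            continue                 # no pivot: the variable stays free
        M[row], M[p] = M[p], M[row]
        pivot = M[row][col]
        M[row] = [v / pivot for v in M[row]]
        # Eliminate the variable from all the other equations.
        for r in range(m):
            if r != row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
        if row == m:
            break
    # An equation 0 = c with c != 0 means there is no solution.
    if any(all(v == 0 for v in M[r][:n]) and M[r][n] != 0 for r in range(m)):
        return None
    free = [c for c in range(n) if c not in pivots]
    particular = [Fraction(0)] * n   # all free variables set to 0
    for r, c in enumerate(pivots):
        particular[c] = M[r][n]
    basis = []
    for f in free:                   # one homogeneous solution per free variable
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, c in enumerate(pivots):
            v[c] = -M[r][f]
        basis.append(v)
    return particular, basis

# The inconsistent pair of parallel lines 2x - 3y = 0, -8x + 12y = 1:
print(solve_linear_system([[2, -3], [-8, 12]], [0, 1]))   # None
```

The examples below can be fed to this function as lists of coefficients to compare its output with the hand computations.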




Example 1.2.1 Consider the system

\[
\begin{aligned}
L_1 &:= x + 2y + 3z = 3 \\
L_2 &:= 2x + 3y + 8z = 4 \\
L_3 &:= 3x + 2y + 17z = 1.
\end{aligned}
\]

We let $\alpha L_i + \beta L_j$ stand for the addition of $\alpha$ times the $i$th equation with
$\beta$ times the $j$th equation. Now, $2L_1 - L_2$ and $-3L_1 + L_3$ are obtained by
adding, respectively,

\[
\begin{aligned}
2x + 4y + 6z &= 6 \\
-2x - 3y - 8z &= -4
\end{aligned}
\qquad \text{and} \qquad
\begin{aligned}
-3x - 6y - 9z &= -9 \\
3x + 2y + 17z &= 1.
\end{aligned}
\]

We thus get a system of two linear equations in two variables $y$ and $z$:

\[
\begin{aligned}
y - 2z &= 2 \\
-4y + 8z &= -8.
\end{aligned}
\]

The last two equations are essentially the same as the second is $-4$ times
the first. Thus, effectively we have only one equation in two variables:
$y - 2z = 2$. We think of $z$ as a "free" variable and assign to it any arbitrary
value, say $t$. Then $y = 2 + 2t$. Substituting these values of $y$ and $z$ in the
first equation of the given system, we get $x = -1 - 7t$. Thus the solution
set $S$ of the given system is

\[
\begin{aligned}
S &= \{(-1 - 7t,\ 2 + 2t,\ t) \mid t \in \mathbb{R}\} \\
  &= (-1, 2, 0) + \{t(-7, 2, 1) \mid t \in \mathbb{R}\}.
\end{aligned}
\]

Note that $(-1, 2, 0)$ is a particular solution of the original system and
$(-7t, 2t, t)$ is a solution of the homogeneous system for any $t \in \mathbb{R}$. We
shall be brief in the rest of the examples.


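Example 1.2.2 Consider the system

\[
\begin{aligned}
L_1 &:= x + y + z = 1 \\
L_2 &:= 2x - y + z = 2.
\end{aligned}
\]

The obvious thing to do is to consider $L_1 + L_2$ thereby eliminating the
$y$-variable and getting the equation $3x + 2z = 3$. We treat $z$ as the free
variable and assign the value $t$ to $z$: $z := t$. We then get $x = 1 - (2/3)t$.
Substituting this value in the first of the given equations, we get $y = -t/3$.
Thus the solution set $S$ is given by

\[
\begin{aligned}
S &= \left\{ \left(1 - \tfrac{2}{3}t,\ -\tfrac{1}{3}t,\ t\right) \,\middle|\, t \in \mathbb{R} \right\} \\
  &= (1, 0, 0) + \left\{ t\left(-\tfrac{2}{3}, -\tfrac{1}{3}, 1\right) \,\middle|\, t \in \mathbb{R} \right\}.
\end{aligned}
\]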




Note that $(1, 0, 0)$, a particular solution of the given system (a point
of the three dimensional space), lies simultaneously on both the planes
defined by the equations of the system. And $\left(-\tfrac{2}{3}, -\tfrac{1}{3}, 1\right)$ is a point of $\mathbb{R}^3$
which lies on the planes through the origin corresponding to the associated
homogeneous system and hence all the points on the line joining it and the
origin also lie on the planes through the origin.


Example 1.2.3 Consider the system

\[
\begin{aligned}
L_1 &:= x_1 - 2x_2 + x_3 + x_4 = 1 \\
L_2 &:= x_1 - 2x_2 + x_3 - x_4 = -1 \\
L_3 &:= x_1 - 2x_2 + x_3 + 5x_4 = 5.
\end{aligned}
\]

Here $L_1 - L_2$ yields $2x_4 = 2$ and $L_1 - L_3$ yields $-4x_4 = -4$. Both these
equations are equivalent and we see that $x_4 = 1$. Substitution of this value
of $x_4$ in the above equations yields a single equation $x_1 - 2x_2 + x_3 = 0$.
This is a linear equation in three variables and we may think of $x_2$ and $x_3$
as free variables. So, we let $x_2 = s$ and $x_3 = t$ so that $x_1 = 2s - t$. Hence
the solution set is

\[
\begin{aligned}
S &:= \{(2s - t,\ s,\ t,\ 1) \mid s \in \mathbb{R},\ t \in \mathbb{R}\} \\
  &= \{(0,0,0,1) + (2s, s, 0, 0) + (-t, 0, t, 0) \mid s \in \mathbb{R},\ t \in \mathbb{R}\} \\
  &= (0,0,0,1) + \{s(2,1,0,0) + t(-1,0,1,0) \mid s \in \mathbb{R},\ t \in \mathbb{R}\}.
\end{aligned}
\]


Example 1.2.4 Solve the system

\[
\begin{aligned}
x_1 + 0x_2 + 4x_3 - x_4 &= 7 \\
0x_1 + x_2 - 2x_3 - 3x_4 &= 8.
\end{aligned}
\]

A look at the second equation shows that the first variable $x_1$ is already
eliminated and we are left with a system of one equation in three variables
$x_2$, $x_3$ and $x_4$. So we treat $x_3$ and $x_4$ as free variables. Let $x_3 := s$ and
$x_4 := t$. Then $x_2 = 8 + (2s + 3t)$. Using this in the first equation, we
get $x_1 = 7 - 4s + t$. Let $S$ be the set of solutions for the above system.
Then, $S = \{(7,8,0,0) + s(-4,2,1,0) + t(1,3,0,1) \mid s, t \in \mathbb{R}\}$. Note that
$(7,8,0,0)$ is a solution of the system. But $(-4,2,1,0)$ and $(1,3,0,1)$ are
not. However $(-4,2,1,0)$ and $(1,3,0,1)$ are solutions of the associated
homogeneous system.
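The closing remarks of Example 1.2.4 are easy to confirm by direct substitution. The following sketch does so in Python; the helper names `satisfies` and `point` are ours, and the range of sample values for $s$ and $t$ is an arbitrary choice.

```python
from itertools import product

def satisfies(A, b, x):
    """True if the tuple x satisfies every equation of the system A x = b."""
    return all(sum(a * v for a, v in zip(row, x)) == rhs
               for row, rhs in zip(A, b))

A = [[1, 0, 4, -1],
     [0, 1, -2, -3]]
b = [7, 8]

particular = (7, 8, 0, 0)
u = (-4, 2, 1, 0)
v = (1, 3, 0, 1)

def point(s, t):
    """The element particular + s*u + t*v of the claimed solution set."""
    return tuple(p + s * a + t * c for p, a, c in zip(particular, u, v))

print(satisfies(A, b, particular))                       # a solution of the system
print(satisfies(A, b, u), satisfies(A, b, v))            # not solutions of the system
print(satisfies(A, [0, 0], u), satisfies(A, [0, 0], v))  # solutions of the homogeneous system
print(all(satisfies(A, b, point(s, t))
          for s, t in product(range(-3, 4), repeat=2)))
```

A finite check of this kind does not replace the argument in the text, but it catches arithmetic slips quickly.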


Example 1.2.5 Solve the system

\[
\begin{aligned}
x + 2y + z &= 0 \\
y + 2z &= 0 \\
x + y - z &= 0.
\end{aligned}
\]




Let $S$ be the set of solutions for the above system. Then,
$S = \{\alpha(3, -2, 1) \mid \alpha \in \mathbb{R}\}$.


Consider the homogeneous system $\sum_{j=1}^{n} a_{ij}x_j = 0$ for $1 \le i \le m$. We
would like to know when this system has non-trivial solutions. Before we
prove a result in this direction, we look at some more examples.


Example 1.2.6 Consider the system of two equations in two unknowns:
$3x + 4y = 0$ and $x + y = 0$. The set of solutions is $S = \{(0,0)\}$. Thus this
system has no non-trivial solutions.


Example 1.2.7 For the system $3x + 4y + z = 0$ and $x + y + z = 0$, we have
$S = \{\alpha(-3, 2, 1) \mid \alpha \in \mathbb{R}\}$.


Example 1.2.8 For the system $x - y + 4z = 4$ and $2x + 6z = -2$, we have
$S = \{(-1, -5, 0) + \alpha(-3, 1, 1) \mid \alpha \in \mathbb{R}\}$.


We see that a homogeneous system need not always have non-trivial
solutions. However, we observe that if the number of unknowns is more
than the number of equations then the system always has a non-trivial
solution.

This is intuitively clear if we look at the geometric interpretation of the
set of solutions. For example, the solutions of $ax + by = 0$ are all points
lying on the line determined by the given equation.

Again, the system

\[
\begin{aligned}
a_1x + b_1y + c_1z &= 0 \\
a_2x + b_2y + c_2z &= 0
\end{aligned}
\]

always has non-trivial solutions. These are two planes (or a single plane)
passing through the origin, hence they intersect and the intersection is a
line (or a plane) passing through the origin.


Theorem 1.2.1 The system $\sum_{j=1}^{n} a_{ij}x_j = 0$ for $1 \le i \le m$ always has a
non-trivial solution if $m < n$.


Proof We first prove the result for $m = 1$ and $n > 1$:

\[ a_{11}x_1 + \cdots + a_{1n}x_n = 0. \]

If each $a_{1j} = 0$ then the equation is $0 = 0$. Any $n$-tuple $(x_1, \ldots, x_n)$ with
$x_i \in \mathbb{R}$ is a solution. Assume therefore $a_{1j} \neq 0$ for some $j$. Then we can
write

\[ x_j = -a_{1j}^{-1}\,(a_{11}x_1 + \cdots + a_{1,j-1}x_{j-1} + a_{1,j+1}x_{j+1} + \cdots + a_{1n}x_n). \]



Hence, if we arbitrarily choose $\alpha_i \in \mathbb{R}$ for all $i \neq j$, not all zero, and take

\[ \alpha_j = -a_{1j}^{-1}\,(a_{11}\alpha_1 + \cdots + a_{1,j-1}\alpha_{j-1} + a_{1,j+1}\alpha_{j+1} + \cdots + a_{1n}\alpha_n), \]

then $(\alpha_1, \ldots, \alpha_n)$ is a solution of $\sum_{j=1}^{n} a_{1j}x_j = 0$. Thus for $m = 1$ and
$n > 1$ we have a non-trivial solution.

We prove the result by induction on $m$. As induction hypothesis, we
assume that if we are given a system of $(m - 1)$ equations in $k$ variables
with $(m - 1) < k$, there exists a non-trivial solution. We prove the result
for $m$ and $n$ with $m < n$.

Let $\sum_{j=1}^{n} a_{ij}x_j = 0$ for $1 \le i \le m$ be a system of $m$ equations in $n$
unknowns with $m < n$. If each $a_{ij} = 0$, as before the system is $0 = 0$ for all
$i$ and so any $n$-tuple $(x_1, \ldots, x_n)$ is a solution. So assume that there exist
$(i, j)$ such that $a_{ij} \neq 0$. Let

\[
\begin{aligned}
L_1 &:= a_{11}x_1 + \cdots + a_{1n}x_n = 0 \\
L_2 &:= a_{21}x_1 + \cdots + a_{2n}x_n = 0 \\
    &\ \,\vdots \\
L_i &:= a_{i1}x_1 + \cdots + a_{in}x_n = 0 \\
    &\ \,\vdots \\
L_m &:= a_{m1}x_1 + \cdots + a_{mn}x_n = 0.
\end{aligned}
\]


Digression

We work out the case $(i,j) = (1,1)$ and $m = 2$, $n = 3$ so as to understand
the ideas and not be overwhelmed by the complicated notation below. If
$a_{11} \neq 0$, then we can write $x_1 = -a_{11}^{-1}(a_{12}x_2 + a_{13}x_3)$. We substitute this
value of $x_1$ in $L_2$ to get

\[ -a_{21}a_{11}^{-1}(a_{12}x_2 + a_{13}x_3) + a_{22}x_2 + a_{23}x_3 = 0. \]

Rearranging the terms, we get

\[ [a_{22} + a_{21}(-a_{11}^{-1})a_{12}]\,x_2 + [a_{23} + a_{21}(-a_{11}^{-1})a_{13}]\,x_3 = 0. \]

This is a homogeneous system of one equation in two unknowns $x_2$ and $x_3$
and has a non-trivial solution by induction hypothesis. So, if $(x_2, x_3) \neq (0,0)$
is a non-trivial solution, it is easily verified that $x_1$ defined above along with
$x_2$ and $x_3$ satisfies the original system.


End of Digression


Since $a_{i1}x_1 + \cdots + a_{in}x_n = 0$ and $a_{ij} \neq 0$, we have

\[ x_j = -a_{ij}^{-1}\,(a_{i1}x_1 + \cdots + a_{i,j-1}x_{j-1} + a_{i,j+1}x_{j+1} + \cdots + a_{in}x_n). \]




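We substitute this value of $x_j$ in the equation $L_k = 0$ for $k \neq i$. We get a
new system of $(m - 1)$ equations $L'_k$ in $(n - 1)$ variables

\[ x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n \]

as follows: For $1 \le k \le m$, $k \neq i$,

\[ L'_k := \sum_{r \neq j} [a_{kr} + a_{kj}(-a_{ij}^{-1})a_{ir}]\,x_r = 0. \]

Since $m - 1 < n - 1$, by induction hypothesis on $m$, we get a non-trivial
solution $\alpha_1, \ldots, \alpha_{j-1}, \alpha_{j+1}, \ldots, \alpha_n$ of this system. In particular, $\alpha_k \neq 0$ for
some $k \neq j$. We take $\alpha_j$ as above:

\[ \alpha_j := -a_{ij}^{-1}\Big(\sum_{r \neq j} a_{ir}\alpha_r\Big). \]

We claim $(\alpha_1, \ldots, \alpha_{j-1}, \alpha_j, \ldots, \alpha_n)$ is a non-trivial solution. For $1 \le k \le m$,

\[
\begin{aligned}
\sum_{r=1}^{n} a_{kr}\alpha_r
  &= \sum_{r \neq j} a_{kr}\alpha_r + a_{kj}\alpha_j \\
  &= \sum_{r \neq j} a_{kr}\alpha_r + a_{kj}(-a_{ij}^{-1})\sum_{s \neq j} a_{is}\alpha_s \\
  &= \sum_{r \neq j} [a_{kr} + a_{kj}(-a_{ij}^{-1})a_{ir}]\,\alpha_r \\
  &= 0 \quad \text{for } k \neq i.
\end{aligned}
\]

By our very definition of $\alpha_j$, $(\alpha_1, \ldots, \alpha_n)$ is a solution of $L_i$:

\[
\begin{aligned}
\sum_{r=1}^{n} a_{ir}\alpha_r
  &= \sum_{r \neq j} a_{ir}\alpha_r + a_{ij}(-a_{ij}^{-1})\Big(\sum_{r \neq j} a_{ir}\alpha_r\Big) \\
  &= \sum_{r \neq j} (a_{ir} - a_{ir})\,\alpha_r = 0.
\end{aligned}
\]

$(\alpha_1, \ldots, \alpha_n)$ is non-trivial since $\alpha_k \neq 0$ for some $k \neq j$ (given by the
induction hypothesis). Thus $\alpha_1, \ldots, \alpha_n$ is a non-trivial solution of the
original system.

$\square$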
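The induction in the proof is constructive, and it may help to see it run. The sketch below follows the same steps in Python: pick a non-zero coefficient $a_{ij}$, substitute for $x_j$ to get the reduced system $L'_k$, solve that recursively, then recover $\alpha_j$. The function name `nontrivial_solution`, the use of exact fractions, and the choice of setting the freely chosen values to 1 are our assumptions, not part of the text.

```python
from fractions import Fraction

def nontrivial_solution(A, n):
    """Given m < n homogeneous equations (rows of A, each of length n),
    return a non-zero list alpha with sum_r A[k][r] * alpha[r] == 0 for all k."""
    m = len(A)
    assert m < n, "the theorem needs fewer equations than unknowns"
    A = [[Fraction(v) for v in row] for row in A]
    # Look for a coefficient a_ij different from 0.
    pos = next(((i, j) for i in range(m) for j in range(n) if A[i][j] != 0), None)
    if pos is None:
        # Every equation reads 0 = 0, so any non-zero n-tuple is a solution.
        return [Fraction(1)] * n
    i, j = pos
    others = [r for r in range(n) if r != j]
    if m == 1:
        # Base case: choose the remaining coordinates freely (here all equal to 1).
        alpha = {r: Fraction(1) for r in others}
    else:
        # Reduced system L'_k (k != i) in the variables x_r, r != j.
        reduced = [[A[k][r] - A[k][j] * A[i][r] / A[i][j] for r in others]
                   for k in range(m) if k != i]
        alpha = dict(zip(others, nontrivial_solution(reduced, n - 1)))
    # alpha_j := -a_ij^{-1} * (sum over r != j of a_ir * alpha_r)
    alpha[j] = -sum(A[i][r] * alpha[r] for r in others) / A[i][j]
    return [alpha[r] for r in range(n)]

# The system of Example 1.2.7: 3x + 4y + z = 0 and x + y + z = 0.
print(nontrivial_solution([[3, 4, 1], [1, 1, 1]], 3))
```

For the system of Example 1.2.7 this particular sequence of choices reproduces the vector $(-3, 2, 1)$ found there; other choices of the free values would give other multiples.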


Exercise 1.2.3 Show that the solution set of the equation $x + y - z = 0$
can be written as $\{s(-1,1,0) + t(1,0,1) \mid s, t \in \mathbb{R}\}$.




Exercise 1.2.4 Find the points of intersection of
$x^2 + y^2 = 13$ and
$3x^2 + 4y^2 = 48$.


Exercise 1.2.5 Let $P(x) = a_0 + a_1x + a_2x^2$. Choose $a_i$ such that $P(1) = b_1$,
$P(2) = b_2$, $P(3) = b_3$. Is this choice unique?


Exercise 1.2.6 Give a system of linear equations having 
(1) (1,0,0) as “the” only solution. 
(2) (1,0,0) and (0,1,0) as solutions. 
(3) (1,0,0), (0, 1,0) and (0, 0,1) as solutions. 




2. Vector Spaces 


2.1 Definition and Examples 


Do you recall the “addition” and “scalar multiplication” we defined on the 
space of solutions of a homogeneous system of linear equations? We define 
a vector space to be a set on which similar operations are defined. More 
precisely, we have 


Definition 2.1.1 A non-empty set $V$ is said to be a vector space over
$\mathbb{R}$ (or a real vector space) if there exist maps $+ : V \times V \to V$, defined by
$(x, y) \mapsto x + y$, called addition, and $\cdot : \mathbb{R} \times V \to V$, defined by $(\alpha, x) \mapsto \alpha \cdot x$,
called scalar multiplication, satisfying the following properties:


(i) $x + y = y + x$ (commutativity of addition).
(ii) $(x + y) + z = x + (y + z)$ (associativity of addition).


(iii) There exists $\mathbf{0} \in V$ such that $x + \mathbf{0} = x = \mathbf{0} + x$ (existence of additive
identity).


(iv) For every $x \in V$ there exists $y \in V$ such that $x + y = \mathbf{0} = y + x$. This
$y$ is denoted by $-x$ (existence of additive inverse).


(v) $\alpha \cdot (x + y) = \alpha \cdot x + \alpha \cdot y$.
(vi) $(\alpha + \beta) \cdot x = \alpha \cdot x + \beta \cdot x$.
(vii) $(\alpha\beta) \cdot x = \alpha \cdot (\beta \cdot x)$.
(viii) $1 \cdot x = x$.


We adopt the following standard notation: $x + (-y)$ is written as $x - y$
for all $x, y \in V$ and for $\alpha \in \mathbb{R}$ and $x \in V$ we write $\alpha x$ for $\alpha \cdot x$. Hereafter,
by a vector space we mean a vector space over $\mathbb{R}$. Elements of a vector
space $V$ are called vectors of $V$. The addition in a vector space is referred
to as the vector addition. $\mathbf{0}$ is called the zero vector. $x \in V$ is a nonzero
vector if $x \neq \mathbf{0}$. A vector $\alpha v$ is called a scalar multiple of $v \in V$.




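Theorem 2.1.1 In a vector space $V$, we have
(1) $0 \cdot x = \mathbf{0}$ for all $x \in V$.


(2) There is a unique additive identity. That is, if $\mathbf{0}$ and $\mathbf{0}'$ are such that
$x + \mathbf{0} = x$ and $x + \mathbf{0}' = x$ for all $x \in V$, then $\mathbf{0} = \mathbf{0}'$.


(3) The additive inverse is unique. That is, if for a given $x$, there are
$y, y' \in V$ such that $x + y = \mathbf{0}$ and $x + y' = \mathbf{0}$, then $y = y'$.


(4) $(-1) \cdot x = -x$, the negative element such that $x + (-x) = \mathbf{0}$ for $x \in V$.
(5) $\alpha \cdot \mathbf{0} = \mathbf{0}$ for all $\alpha \in \mathbb{R}$ and $\mathbf{0} \in V$.
(6) If $\alpha \cdot x = \mathbf{0}$ for $\alpha \in \mathbb{R}$ and $x \in V$, then either $\alpha = 0$ or $x = \mathbf{0}$.

Proof (1) We have

\[ 0 \cdot x = (0 + 0) \cdot x = 0 \cdot x + 0 \cdot x \tag{2.1.1} \]

by property (vi) in Definition 2.1.1. Adding $-0 \cdot x$, an additive inverse of
$0 \cdot x$, which exists according to (iv), to both sides of Equation (2.1.1), we
get

\[
\begin{aligned}
\mathbf{0} &= 0 \cdot x + (-0 \cdot x) && \text{by (iv)} \\
  &= (0 \cdot x + 0 \cdot x) + (-0 \cdot x) && \text{by Equation (2.1.1)} \\
  &= 0 \cdot x + (0 \cdot x + (-0 \cdot x)) && \text{by (ii)} \\
  &= 0 \cdot x + \mathbf{0} && \text{by (iv)} \\
  &= 0 \cdot x && \text{by (iii).}
\end{aligned}
\]

(2) Assume that $x + \mathbf{0} = x = x + \mathbf{0}'$ for all $x \in V$. In particular $\mathbf{0} + \mathbf{0}' = \mathbf{0}$
since $\mathbf{0}'$ is an additive identity. Similarly, $\mathbf{0} + \mathbf{0}' = \mathbf{0}'$ since $\mathbf{0}$ is an additive
identity. Hence $\mathbf{0} = \mathbf{0}'$.


(3) Assume that $x + y = \mathbf{0} = x + y'$. We add $y'$ to all the sides of this
equation, use commutativity and associativity of the addition:

\[
\begin{aligned}
y' = y' + \mathbf{0} &= y' + (x + y) \\
  &= (y' + x) + y \\
  &= (x + y') + y \\
  &= \mathbf{0} + y \\
  &= y.
\end{aligned}
\]

(4) $(-1) \cdot x + x = (-1) \cdot x + 1 \cdot x = (-1 + 1) \cdot x = 0 \cdot x = \mathbf{0}$ so that
$(-1) \cdot x = -x$.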




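(5) We have $\mathbf{0} + \mathbf{0} = \mathbf{0}$ from (iii). Hence $\alpha \cdot \mathbf{0} = \alpha \cdot (\mathbf{0} + \mathbf{0}) = \alpha \cdot \mathbf{0} + \alpha \cdot \mathbf{0}$
by (v). Adding $-\alpha \cdot \mathbf{0}$ to both sides and using (iii) and (ii), we get

\[
\begin{aligned}
\mathbf{0} &= \alpha \cdot \mathbf{0} + (-\alpha \cdot \mathbf{0}) \\
  &= (\alpha \cdot \mathbf{0} + \alpha \cdot \mathbf{0}) + (-\alpha \cdot \mathbf{0}) \\
  &= \alpha \cdot \mathbf{0} + (\alpha \cdot \mathbf{0} + (-\alpha \cdot \mathbf{0})) \\
  &= \alpha \cdot \mathbf{0} + \mathbf{0} \\
  &= \alpha \cdot \mathbf{0}.
\end{aligned}
\]

(6) If $\alpha x = \mathbf{0}$ and $\alpha \neq 0$, then we multiply both sides of $\alpha x = \mathbf{0}$ by $\alpha^{-1}$ to
get

\[ \alpha^{-1} \cdot (\alpha x) = \alpha^{-1} \cdot \mathbf{0} \implies (\alpha^{-1}\alpha) \cdot x = \alpha^{-1} \cdot \mathbf{0}. \]

The extreme left term of this equation is $x$ by (vii) and (viii), and the
extreme right term is $\mathbf{0}$ by (5).

$\square$

Remark 2.1.1 For every $x \in V$ there is a unique $y \in V$ such that $x + y = \mathbf{0}$.
This $y$ is given by $y = (-1)x$ so that $-x = (-1)x$.

Remark 2.1.2 We shall denote the additive identity $\mathbf{0}$ by $0$ henceforth.
The context will make it clear whether $0$ denotes the real number zero or
the additive identity $\mathbf{0}$. The reader is urged to go through the above proof
using the same symbol $0$ for both and try to understand the proof.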


We now look at some examples of vector spaces. 


Example 2.1.1 Let $X$ be a non-empty set. Let

\[ V = \mathcal{F}(X, \mathbb{R}) = \{f : X \to \mathbb{R}\} \]

be the set of real valued functions on the set $X$. For $f, g \in V$, we wish to
define $f + g \in V$. Thus $f + g$ must be a function from $X$ to $\mathbb{R}$. So to define
$f + g$ it is enough if we say what its value at any arbitrary point $x \in X$
is. We define $(f + g)(x) := f(x) + g(x)$. Similar considerations suggest the
definition of $\alpha f \in V$, for $\alpha \in \mathbb{R}$ and $f \in V$, as the function whose value at
$x$ is given by $(\alpha f)(x) = \alpha f(x)$. Then $V$ is a vector space over $\mathbb{R}$.
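The pointwise definitions translate almost word for word into a programming language in which functions are values. Here is a small sketch in Python, taking $X$ to be a set of numbers; the names `add` and `scale` and the sample functions `f` and `g` are ours, chosen only for illustration.

```python
def add(f, g):
    """Pointwise sum: (f + g)(x) := f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(alpha, f):
    """Pointwise scalar multiple: (alpha f)(x) := alpha * f(x)."""
    return lambda x: alpha * f(x)

# Two sample elements of F(X, R), with X a set of integers.
f = lambda x: x * x
g = lambda x: 3 * x + 1

h = add(f, scale(2, g))      # the function x -> x^2 + 2(3x + 1)
print(h(2))                  # 4 + 2 * 7 = 18

# Spot-check commutativity (i) and distributivity (v) at a few points of X.
points = range(-3, 4)
print(all(add(f, g)(x) == add(g, f)(x) for x in points))
print(all(scale(5, add(f, g))(x) == add(scale(5, f), scale(5, g))(x) for x in points))
```

Each vector space axiom for $V$ reduces in this way to the corresponding property of real numbers, applied at every point $x$.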


Example 2.1.2 Let $X$ be any nonempty set and let $\mathcal{F}_0(X, \mathbb{R})$ denote the
set of functions $f$ from $X$ to $\mathbb{R}$ such that the set $\{x \in X \mid f(x) \neq 0\}$ is
finite (this set may depend on $f$). Thus, $f \in \mathcal{F}_0(X, \mathbb{R})$ if and only if
$f(x) = 0$ except for finitely many $x \in X$. Clearly, $\mathcal{F}_0(X, \mathbb{R})$ is a subset
of $\mathcal{F}(X, \mathbb{R})$ of Example 2.1.1. We define addition and scalar multiplication as
earlier: $f + g$ and $\alpha f$ are elements of $\mathcal{F}_0(X, \mathbb{R})$ whose values at $x \in X$ are
given by $(f + g)(x) := f(x) + g(x)$ and $(\alpha f)(x) = \alpha f(x)$. Note that $f + g$
and $\alpha f$ lie in $\mathcal{F}_0(X, \mathbb{R})$, that is, the sets $\{x \in X \mid (f + g)(x) \neq 0\}$ and
$\{x \in X \mid \alpha f(x) \neq 0\}$ are finite subsets of $X$.




Example 2.1.3 Let $V = \mathbb{R}^n = \{(x_1, \ldots, x_n) \mid x_i \in \mathbb{R}\}$. For

\[ x = (x_1, \ldots, x_n), \qquad y = (y_1, \ldots, y_n) \]

define

\[ x + y := (x_1 + y_1, \ldots, x_n + y_n) \]

and

\[ \alpha x := (\alpha x_1, \ldots, \alpha x_n) \]

for $\alpha \in \mathbb{R}$. Then $V$ is a vector space under these operations. In particular,
$\mathbb{R}^1 = \mathbb{R}$ is a vector space over $\mathbb{R}$. Note that the addition and scalar
multiplication on $\mathbb{R}^n$ are the same as the ones we had defined for solutions of a
homogeneous system of linear equations.

This example is a special case of Example 2.1.1: Take $X = \{1, \ldots, n\}$ and
define $f : X \to \mathbb{R}$ by $x_i = f(i)$ for $1 \le i \le n$. Then the map

\[ T : f \mapsto (f(1), \ldots, f(n)) \]

is a bijection of $\mathcal{F}(X, \mathbb{R})$ and $\mathbb{R}^n$. What is interesting is the fact that
addition (respectively scalar multiplication) on $\mathcal{F}(X, \mathbb{R})$ corresponds
to that on $\mathbb{R}^n$ under this bijection.
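The correspondence can be made concrete. In the sketch below a function on $X = \{1, \ldots, n\}$ is stored as a Python dictionary with keys $1, \ldots, n$; this representation and the helper names are our own choices for illustration.

```python
def T(f, n):
    """Send a function f on X = {1, ..., n} to the n-tuple (f(1), ..., f(n))."""
    return tuple(f[i] for i in range(1, n + 1))

def T_inverse(x):
    """Send an n-tuple back to the function i -> x_i on {1, ..., n}."""
    return {i: x_i for i, x_i in enumerate(x, start=1)}

def add_functions(f, g):
    """Pointwise sum in F(X, R)."""
    return {i: f[i] + g[i] for i in f}

def add_tuples(x, y):
    """Componentwise sum in R^n."""
    return tuple(a + b for a, b in zip(x, y))

f = {1: 2, 2: -1, 3: 5}
g = {1: 0, 2: 4, 3: 1}

# Adding first and then applying T agrees with applying T first and then adding.
print(T(add_functions(f, g), 3))
print(add_tuples(T(f, 3), T(g, 3)))
# T is undone by T_inverse, so it is a bijection.
print(T_inverse(T(f, 3)) == f)
```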


Geometric interpretation of vector addition in $\mathbb{R}^2$


Most often we look at $\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^3$ to understand the geometric meaning
underlying the concepts. In such attempts, we shall assume some very
basic knowledge of analytic geometry. The first in this direction is the
geometric interpretation of addition of two vectors in $\mathbb{R}^2$. Let $x, y \in \mathbb{R}^2$
where $x = (x_1, x_2)$, $y = (y_1, y_2)$. Then $x + y = (x_1 + y_1, x_2 + y_2)$. In
Figure 2.1.1, $A = (x_1, x_2)$, $B = (y_1, y_2)$. Form the parallelogram with sides
$OA$ and $OB$. Let the fourth vertex be $C$ with coordinates $(\alpha, \beta)$.

Let $M$ be the point of intersection of the diagonals of the parallelogram
$OACB$. Then $M$ is the midpoint of $BA$ as well as $OC$. Since $M$ is the
midpoint of $BA$,

\[ M = \tfrac{1}{2}(B + A) = \left( \frac{x_1 + y_1}{2},\ \frac{x_2 + y_2}{2} \right). \]

Since $M$ is the midpoint of $OC$, we have $M = \tfrac{1}{2}[(0,0) + (\alpha, \beta)]$. Comparing
coordinates we get $\frac{\alpha}{2} = \frac{x_1 + y_1}{2}$ and $\frac{\beta}{2} = \frac{x_2 + y_2}{2}$, or $\alpha = x_1 + y_1$ and
$\beta = x_2 + y_2$, that is, $C = A + B$. Thus $x + y$ is the vector represented by
the diagonal of the parallelogram spanned by $x$ and $y$.




Figure 2.1.1 Addition of vectors. 


Exercise 2.1.1 Let $V := S := \{(x_n) \mid x_n \in \mathbb{R}\}$ be the set of all real
sequences. Then $V$ is a vector space over $\mathbb{R}$ under the following operations:

$$(x_n) + (y_n) := (x_n + y_n)$$
$$\alpha(x_n) := (\alpha x_n).$$


Note that again this is a special case of Example 2.1.1 if we take X = N. 


Exercise 2.1.2 Let C be the set of all convergent real sequences. Note 
that C is a subset of S of Exercise 2.1.1. We define the addition and scalar 
multiplication as in Exercise 2.1.1. Then C is a vector space. The subtle 
point of this assertion is that we have to show that if $(x_n) \in C$ and $(y_n) \in C$
then $(x_n) + (y_n)$ lies in $C$; in other words, $(x_n + y_n)$ is convergent if $(x_n)$
and $(y_n)$ are so. A similar fact is needed for $\alpha(x_n)$. These are well-known
facts from Analysis. Thus to show that C is a vector space we need results 
from analysis! 


Exercise 2.1.3 Let $C_0$ be the set of null sequences, that is,

$$C_0 := \{(x_n) \mid \lim x_n = 0\}.$$

Note that $C_0 \subset C \subset S$. Then $C_0$ is a vector space under the same
operations as in Exercise 2.1.1.


Exercise 2.1.4 Let $V := C([a,b])$ be the set of all real valued continuous
functions on $[a,b]$. Note that this is a subset of $\mathcal{F}([a,b],\mathbb{R})$ of Example 2.1.1.
We define the addition and scalar multiplication as in Example 2.1.1. Again
the crucial steps are to show that $f + g$ and $\alpha f$ are in $V$. That is, they
are continuous if $f$ and $g$ are so. This follows from analysis. Then $V$ is a
vector space over $\mathbb{R}$.


Exercise 2.1.5 The set $D([0,1])$ of differentiable functions on $[0,1]$ is a
subset of $C([0,1])$. It is a vector space under the addition and scalar
multiplication of functions as in Example 2.1.1. You need analysis to prove this
assertion.


Exercise 2.1.6 Let $V := \mathcal{R}([a,b])$ be the set of all Riemann integrable
functions on $[a,b]$. This is a subset of $\mathcal{F}([a,b],\mathbb{R})$ and we define addition
and scalar multiplication in a way which is by now familiar to you. Again 
you need analysis to show that this is indeed a vector space. 


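Exercise 2.1.7 Recall that a function $f : \mathbb{R} \to \mathbb{R}$ is called even (respectively
odd) if $f(-x) = f(x)$ for all $x \in \mathbb{R}$ (respectively $f(-x) = -f(x)$ for $x \in \mathbb{R}$).
Let $\mathcal{F}_+(\mathbb{R},\mathbb{R})$ (respectively, $\mathcal{F}_-(\mathbb{R},\mathbb{R})$) denote the set of even (respectively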
odd) functions from R to R. Are they vector spaces under the obvious 
definitions? 

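Exercise 2.1.8 Let $V = P := \{\sum_{i=0}^{n} a_i X^i \mid a_i \in \mathbb{R},\ n \in \mathbb{N}\}$ be the set
of all polynomials in one variable with real coefficients. The addition and
scalar multiplication are the usual ones:

$$\left(\sum_{i=0}^{m} a_i X^i\right) + \left(\sum_{j=0}^{n} b_j X^j\right) := \sum_{r} (a_r + b_r) X^r$$

where $a_r = 0$ if $r > m$ and $b_r = 0$ if $r > n$. Also,

$$\alpha \left(\sum_{i=0}^{n} a_i X^i\right) := \sum_{i=0}^{n} (\alpha a_i) X^i.$$

Then $V$ is a vector space over $\mathbb{R}$.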


Exercise 2.1.9 Let $V = P_n := \{\sum_{i=0}^{n} a_i X^i \mid a_i \in \mathbb{R}\}$ be the set of all
polynomials of degree $\le n$ with real coefficients. This is a subset of $P$ of
Exercise 2.1.8. So we define the operations as in $P$. Then $V$ is a vector
space over $\mathbb{R}$.


Exercise 2.1.10 Let $V$ denote the set of all polynomials of degree exactly
$n$. Is it a vector space under the usual addition and scalar multiplication
of polynomials? 


Exercise 2.1.11 Let V be the set of all solutions of a system of m homo- 
geneous linear equations in n variables with real coefficients. Then V is a 
vector space over $\mathbb{R}$ under obvious operations. Can you realize this as a
subset of $\mathbb{R}^n$?




Exercise 2.1.12 The set $M_{n \times m}(\mathbb{R})$ of all $n \times m$ matrices with real entries
is a vector space over $\mathbb{R}$ with the operation of addition of matrices and scalar
multiplication of matrices: $(a_{ij}) + (b_{ij}) := (a_{ij} + b_{ij})$ and $\alpha(a_{ij}) := (\alpha a_{ij})$.
Can you set up a bijection from $M_{n \times m}(\mathbb{R})$ onto $\mathcal{F}(X,\mathbb{R})$ where

$$X = \{1, \ldots, n\} \times \{1, \ldots, m\}?$$

We let $M(n,\mathbb{R})$ denote the set of $n \times n$ matrices with real entries.


Exercise 2.1.13 Recall that a real $n \times n$ matrix $A = (a_{ij})$ is said to be
symmetric if $a_{ij} = a_{ji}$ for all $1 \le i, j \le n$. Let $S_n$ denote the set of all $n \times n$
symmetric real matrices. Then under the operations of matrix addition and
scalar multiplication as in Exercise 2.1.12, $S_n$ is a vector space.


Exercise 2.1.14 A real matrix $A = (a_{ij})$, $1 \le i, j \le n$, is skew-symmetric
if $a_{ij} = -a_{ji}$ for all $1 \le i, j \le n$. If $A_n$ denotes the set of all
skew-symmetric matrices, then $A_n$ is a vector space under obvious addition and
scalar multiplication. Note that both $S_n$ and $A_n$ are subsets of $M(n,\mathbb{R})$.


Exercise 2.1.15 Let V, W be vector spaces. Let us form the Cartesian 
product V x W. Define addition and scalar multiplication on V x W as 
follows: 


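$$(v_1, w_1) + (v_2, w_2) := (v_1 + v_2, w_1 + w_2), \quad (v_i, w_i) \in V \times W,\ i = 1, 2$$
$$\alpha(v, w) := (\alpha v, \alpha w), \quad \alpha \in \mathbb{R},\ (v, w) \in V \times W.$$


Then $V \times W$ is a vector space. This vector space is usually denoted by
$V \oplus W$ and called the direct sum of $V$ and $W$.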


Exercise 2.1.16 Extend the construction in Exercise 2.1.15 to define the 
direct sum $V_1 \oplus \cdots \oplus V_n$ of $n$ vector spaces $V_i$, $1 \le i \le n$. Do you recognize
$\mathbb{R} \oplus \cdots \oplus \mathbb{R}$?


Exercise 2.1.17 Let $V$ be a vector space. On $V \times V$, define $+$ and $\cdot$ as
follows:

(1) $(v_1, w_1) + (v_2, w_2) = (v_1 + w_2, w_1 + v_2)$,
    $\alpha(v, w) = (\alpha v, \alpha w)$, $\alpha \in \mathbb{R}$, $(v, w) \in V \times V$.
(2) $(v_1, w_1) + (v_2, w_2) = (v_1 + w_1, v_2 + w_2)$,
    $\alpha(v, w) = (\alpha v, \alpha w)$.

Then $V \times V$ is not a vector space as the addition violates some of the
conditions (i)-(iv) in Definition 2.1.1.
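
One way to see a failure concretely is to take $V = \mathbb{R}$ and test commutativity of the two proposed additions on a single pair of elements. This is only a sketch: the function names are hypothetical, and it assumes that commutativity of addition is among the conditions of Definition 2.1.1.

```python
# V = R here, so elements of V x V are pairs of numbers.

def add_rule_1(p, q):
    """Rule (1): (v1, w1) + (v2, w2) = (v1 + w2, w1 + v2)."""
    (v1, w1), (v2, w2) = p, q
    return (v1 + w2, w1 + v2)

def add_rule_2(p, q):
    """Rule (2): (v1, w1) + (v2, w2) = (v1 + w1, v2 + w2)."""
    (v1, w1), (v2, w2) = p, q
    return (v1 + w1, v2 + w2)

p, q = (1, 0), (0, 0)

# For each rule, compare p + q with q + p.
for name, add in (("rule (1)", add_rule_1), ("rule (2)", add_rule_2)):
    print(name, add(p, q), add(q, p), add(p, q) == add(q, p))
```

For this pair both rules give different results for the two orders, so neither addition is commutative.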


Exercise 2.1.18 Let $V$ be a vector space and $X$ be any (non-empty) set.
Let $W$ be the set of functions $f : X \to V$. On $W$ define addition and scalar
multiplication as follows:

$$(f + g)(x) := f(x) + g(x), \quad f, g \in W,\ x \in X$$
$$(\alpha \cdot f)(x) := \alpha f(x), \quad \alpha \in \mathbb{R},\ x \in X.$$

Then $W$ is a vector space.
Note the similarity of this exercise with that of Example 2.1.1.


Exercise 2.1.19 Let $X := \{*\}$ be a singleton set and let $V$ be a vector
space. Let $W = \{*\} \times V$. We can turn $W$ into a vector space as follows:

$$(*, v_1) + (*, v_2) := (*, v_1 + v_2), \quad v_1, v_2 \in V$$
$$\alpha(*, v) := (*, \alpha v), \quad \alpha \in \mathbb{R},\ v \in V.$$


Exercise 2.1.20 Let V :=Q. On Q we have a natural addition, namely, 
the addition of rational numbers. However, if $\alpha \in \mathbb{R}$ is irrational and $r \in \mathbb{Q}$
is nonzero, then $\alpha r \in \mathbb{R}$ but $\alpha r \notin \mathbb{Q}$. Thus $V$ is not a vector space over $\mathbb{R}$.


Exercise 2.1.21 Let $\mathbb{C}$ denote the set of complex numbers:

$$\mathbb{C} := \{z = x + iy \mid x, y \in \mathbb{R}\}.$$

We identify $\mathbb{R}$ as a subset of $\mathbb{C}$ consisting of the complex numbers whose
imaginary part is zero. Recall the addition of complex numbers:

$$(x + iy) + (u + iv) := (x + u) + i(y + v)$$

and the multiplication of complex numbers:

$$(a + ib)(x + iy) := (ax - by) + i(ay + bx).$$

We turn $\mathbb{C}$ into a vector space over $\mathbb{R}$ by declaring vector addition the same
as addition of complex numbers as above and the scalar multiplication $\alpha \cdot z$
is the multiplication of complex numbers $\alpha$ and $z$: $\alpha \cdot z := \alpha x + i\alpha y$ where
$z := x + iy$. Under these operations, $\mathbb{C}$ becomes a vector space over $\mathbb{R}$.
This follows from the commutativity, associativity and distributivity of the
operations in $\mathbb{C}$.


Exercise 2.1.22 Let $P_i$, $1 \le i \le n$, be continuous functions on $[a,b] \subset \mathbb{R}$.
Let $V$ be the set of $n$-times continuously differentiable solutions $f$ on $[a,b]$
of a linear differential equation

$$y^{(n)} + P_1(x) y^{(n-1)} + \cdots + P_n(x) y = 0.$$

If $f$ and $g$ are solutions of the differential equation, we let
$(f + g)(x) = f(x) + g(x)$ and $(\alpha \cdot f)(x) = \alpha f(x)$
for $x \in [a,b]$ and $\alpha \in \mathbb{R}$. Then $V$ is a vector space.




The rest of the section may be omitted in the first reading. 


Remark 2.1.3 One can define vector spaces over $\mathbb{C}$: Scalar multiplication
now will involve complex numbers. In Definition 2.1.1, if we replace $\mathbb{R}$ by $\mathbb{C}$
then what we get is called a vector space over $\mathbb{C}$ or a complex vector space.
Examples are obtained from some of our earlier examples (and exercises),
by replacing $\mathbb{R}$ with $\mathbb{C}$. For instance, if we consider the set of complex
valued functions $\mathcal{F}(X,\mathbb{C})$ on a nonempty set $X$ with the operations

$$(f + g)(x) = f(x) + g(x), \quad x \in X,\ f, g \in \mathcal{F}(X,\mathbb{C})$$
$$(\alpha f)(x) = \alpha f(x), \quad x \in X,\ f \in \mathcal{F}(X,\mathbb{C}),\ \alpha \in \mathbb{C},$$

then $\mathcal{F}(X,\mathbb{C})$ is a complex vector space. One can similarly construct
complex vector spaces $\mathbb{C}^n$, $M(n,\mathbb{C})$ etc.


Remark 2.1.4 One can similarly replace R by Q in Definition 2.1.1 and 
get vector spaces over Q. Can you think of vector spaces over Q? Do you 
see that any vector space over $\mathbb{R}$ is a vector space over $\mathbb{Q}$? Can you think
of a vector space over $\mathbb{Q}$ which is not a vector space over $\mathbb{R}$? Replace
the pair (Q,R) by the pair (R,C) in the above questions and answer them. 
Could we have replaced Q by C in the original set of questions? 


2.2 Vector Subspaces 


An astute reader would have noticed something very striking in our list 
of exercises of vector spaces in Section 2.1. We seem to have basically a 
few vector spaces and the rest were subsets of them. For example, the 
solution set of a homogeneous system of $m$ equations (with real
coefficients) in $n$ variables is a subset of $\mathbb{R}^n$ (see Exercise 2.1.11). Similarly, the
vector spaces $C([a,b],\mathbb{R})$ of Exercise 2.1.4, $\mathcal{R}([a,b],\mathbb{R})$ of Exercise 2.1.6 and
$D[0,1]$ of Exercise 2.1.5 are subsets of the same vector space $\mathcal{F}([a,b],\mathbb{R})$.
Furthermore, from analysis, we know that $C([a,b],\mathbb{R}) \subset \mathcal{R}([a,b],\mathbb{R})$ and
$D([0,1],\mathbb{R}) \subset C([0,1],\mathbb{R})$. Again, $S_n$ of Exercise 2.1.13 and $A_n$ of
Exercise 2.1.14 are subsets of $M(n,\mathbb{R})$. Moreover, addition and scalar
multiplications on the subsets were the same as the ones on the bigger set. To
put it differently, these subsets enjoy the property that whenever we add 
any two elements of the subset we again get an element of the subset and a 
scalar multiple of an element of the subset lies again in the subset. One says 
that the subset is closed under the vector addition and scalar multiplication. 
These observations suggest the following definition. 

A vector subspace $W$ of a vector space $V$ over $\mathbb{R}$ is any non-empty subset
$W \subset V$ which is closed under the addition and scalar multiplication on $V$.




More precisely, we have the following definition: 


Definition 2.2.1 Let $W$ be a non-empty subset of a vector space $V$. Then
$W$ is said to be a vector subspace (or simply a subspace) of $V$ if $W$ itself is
a vector space under the operations induced from $V$. That is,

(i) $0 \in W$.
(ii) If $w_1, w_2 \in W$, then $w_1 + w_2 \in W$.
(iii) If $\alpha \in \mathbb{R}$ and $w \in W$, then $\alpha w \in W$.
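
Conditions (i) to (iii) can be spot-checked numerically before attempting a proof. The sketch below is an illustration under stated assumptions, not a proof: it samples integer points of two sets in $\mathbb{R}^3$ chosen for this example, the plane $x + y + z = 0$ and the shifted plane $x + y + z = 1$, and all function names are hypothetical.

```python
import random

def in_plane(w):
    """Membership in W = {(x, y, z) : x + y + z = 0}."""
    return sum(w) == 0

def in_shifted_plane(w):
    """Membership in U = {(x, y, z) : x + y + z = 1}."""
    return sum(w) == 1

def sample(member_sum):
    """Random integer point (x, y, z) with x + y + z = member_sum."""
    x, y = random.randint(-9, 9), random.randint(-9, 9)
    return (x, y, member_sum - x - y)

def passes_spot_check(contains, member_sum, trials=200):
    """Test conditions (i)-(iii) on random samples; a pass is evidence, not a proof."""
    if not contains((0, 0, 0)):                             # condition (i)
        return False
    for _ in range(trials):
        w1, w2 = sample(member_sum), sample(member_sum)
        a = random.randint(-9, 9)
        if not contains(tuple(s + t for s, t in zip(w1, w2))):  # condition (ii)
            return False
        if not contains(tuple(a * s for s in w1)):              # condition (iii)
            return False
    return True

print(passes_spot_check(in_plane, 0))
print(passes_spot_check(in_shifted_plane, 1))
```

The shifted plane fails immediately because it does not contain the zero vector.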


Exercise 2.2.1 Show that the following subsets W are vector subspaces 
of V: 


(1) $W = C([0,1])$ and $V = \mathcal{F}([0,1],\mathbb{R})$.
(2) $W = C([0,1])$ and $V = \mathcal{R}([0,1],\mathbb{R})$.
(3) $W = D([0,1])$ and $V = C([0,1],\mathbb{R})$.
(4) $W = C$ and $V = S$.
(5) $W = S_n$ and $V = M(n,\mathbb{R})$.


Exercise 2.2.2 Show that the set {0} consisting of the zero vector in any 
vector space is a vector subspace. 


Exercise 2.2.3 Find some more examples of vector spaces and vector sub- 
spaces from our list in the last section. 


Exercise 2.2.4 Fix $x_0 \in X$. Let $S = \{f : X \to \mathbb{R} \mid f(x_0) = 0\}$. Then $S$
is a vector subspace of $\mathcal{F}(X,\mathbb{R})$.


We now want to address the following problem: How does one “create”
vector subspaces out of a given vector space? We start with the simplest
case.

Let $V$ be a vector space. Let $v \in V$. We want a vector subspace $V_0$ of
$V$ which contains $v$. We can take $V_0 = V$! So what we want is a vector
subspace $V_0$ containing $v$ which is as “small” as possible. If such a $V_0$ exists,
since $v \in V_0$, all scalar multiples $\alpha v \in V_0$ for any $\alpha \in \mathbb{R}$. As

$$\alpha v + \beta v = (\alpha + \beta)v, \quad -v = (-1)v$$

we see that if we take $V_0 = \{\alpha v \mid \alpha \in \mathbb{R}\}$, then $V_0$ is a vector subspace
containing $v$. Let us make sure that we understand this. If $x = \alpha v \in V_0$
and $y = \beta v \in V_0$, then, what we are supposed to show is that $x + y \in V_0$.
But $x + y = \alpha v + \beta v = (\alpha + \beta)v = \gamma v \in V_0$ where $\gamma = \alpha + \beta$. Similarly
we show that if $x \in V_0$, then $\alpha x \in V_0$ for any $\alpha \in \mathbb{R}$. Also, $V_0$ is the smallest in
the sense that if $W$ is a vector subspace containing $v$, then $W \supset V_0$. This
is clear: Whenever $v \in W$, since $W$ is a vector subspace, $\alpha v \in W$ for any
$\alpha \in \mathbb{R}$. Thus any arbitrary element of $V_0$ lies in $W$. Hence $V_0 \subset W$. $V_0$ is
usually denoted by $\mathbb{R}v$.

It is worth going through the last paragraph once again as it forms the 
heart of the matter to come. Before we go any further let us look at the 
geometric meaning of $V_0$. Let $v := (r,s) \in \mathbb{R}^2$ be a nonzero vector in $\mathbb{R}^2$.
Then the smallest subspace $V_0$ containing $v$ is the set

$$\{tv \mid t \in \mathbb{R}\} = \{(tr, ts) \mid t \in \mathbb{R}\}.$$

This is nothing but the line through the origin and the point $(r,s)$ in $\mathbb{R}^2$. Do
you see this? The line joining the origin and $(r,s)$ is given (when $r$ and $s$
are both nonzero) by the equation

$$\frac{x - 0}{0 - r} = \frac{y - 0}{0 - s}.$$

Hence any point of this line is given by $(tr, ts)$ for some $t \in \mathbb{R}$.


Exercise 2.2.5 What is the geometric object corresponding to the
smallest subspace $V_0$ containing a nonzero vector $v = (r,s,t) \in \mathbb{R}^3$?


Now we let $v, w \in V$ and ask for the smallest vector subspace $V_0$
containing $v$ and $w$. As earlier, $\alpha v \in V_0$ and $\beta w \in V_0$ for any $\alpha, \beta \in \mathbb{R}$.
Since $V_0$ is a vector subspace, $\alpha v + \beta w \in V_0$. This suggests taking

$$V_0 = \{\alpha v + \beta w \mid \alpha \in \mathbb{R},\ \beta \in \mathbb{R}\}.$$

Notice that for $\alpha = 1$, $\beta = 0$, $1v + 0w = v \in V_0$. Similarly,
$w = 0 \cdot v + 1 \cdot w \in V_0$.

One easily shows that $V_0$ is a vector subspace of $V$ containing $v$ and $w$.
Moreover it is the smallest vector subspace containing both $v$ and $w$.

Exercise 2.2.6 Prove the last two assertions of the last paragraph.


Exercise 2.2.7 Let $v = (r,s,t) \in \mathbb{R}^3$ and $w = (a,b,c) \in \mathbb{R}^3$ be two nonzero
vectors. Show that the smallest vector subspace of $\mathbb{R}^3$ containing $v$ and $w$
is either

(i) a plane containing these points and the origin, or
(ii) a line passing through these points and the origin.

In the latter case, one of them is a scalar multiple of the other: Either
$v = \alpha w$ or $w = \beta v$ for some $\alpha, \beta \in \mathbb{R}$.




More generally, we have the following exercise: 


Exercise 2.2.8 If $S = \{v_1, \ldots, v_k\}$ is a subset of a vector space $V$, arguing
as above, show that

$$V_0 = \{\alpha_1 v_1 + \cdots + \alpha_k v_k \mid \alpha_1, \ldots, \alpha_k \in \mathbb{R}\}$$

is the smallest vector subspace containing $S$, that is, a vector subspace
containing $S$ and if $W$ is a vector subspace containing $S$, then $V_0 \subset W$.


This suggests the following definition: 


Definition 2.2.2 Given $\{v_i\}_{i=1}^{k}$, a finite linear combination of the $v_i$ is a
vector of the form $\sum_{i=1}^{k} \alpha_i v_i$, with $\alpha_i \in \mathbb{R}$.


Thus Exercise 2.2.8 can be reformulated using this definition as follows:
The set of all finite linear combinations of $v_1, \ldots, v_k$ is the smallest vector
subspace containing $v_1, \ldots, v_k$.

Now if $S$ is any subset of $V$, we let $L(S)$ be the set of all finite linear
combinations of elements of $S$. Thus

$$L(S) = \left\{ \sum_{i=1}^{k} \alpha_i v_i \;\middle|\; k \in \mathbb{N},\ v_i \in S,\ \alpha_i \in \mathbb{R} \right\}.$$

Note here that $k$, $\alpha_i$, $v_i$ are all arbitrarily chosen from their respective
domains. Then $L(S)$ is the smallest vector subspace of $V$ containing the
given set $S$. $L(S)$ is also called the linear span of $S$ and denoted by $\mathrm{Span}(S)$.


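Example 2.2.1 Let $S = \{v\} \subset V$ for some $v \in V$. Then
$L(S) = \mathbb{R}v := \{\alpha v \mid \alpha \in \mathbb{R}\}$.

What is $L(S)$ if $S := \{e_1 - e_2, e_1 + e_2\}$, where $e_1 = (1,0)$ and $e_2 = (0,1)$ in
$V = \mathbb{R}^2$?

What is $L(S)$ if $S = \{e_1, e_2, e_1 + e_2\}$ in $\mathbb{R}^3$ where $e_1 = (1,0,0)$ and
$e_2 = (0,1,0)$?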


Definition 2.2.3 We say a vector subspace $V_0$ of $V$ is generated by the
subset $S \subset V$ if the smallest vector subspace containing $S$ is $V_0$, that is, a
vector subspace $V_0$ of $V$ such that

(i) $S \subset V_0$, and
(ii) if $W$ is any vector subspace such that $S \subset W$ then $V_0 \subset W$.

In such a case, we denote $V_0$ by $\langle S \rangle$.




Exercise 2.2.9 If $S = \{v_1, \ldots, v_k\} \subset V$, then the vector subspace $\langle S \rangle$
generated by $S$ is precisely $L(S)$.


Remark 2.2.1 Thus for a subset $S \subset V$, we have
$L(S) = \mathrm{Span}(S) = \langle S \rangle$.


We shall use these interchangeably. 


Let us address another problem. Can it happen that L(S’) = L(S) for 
subsets S’ C S C V? Let us again look at the simplest case. Let S’ = {v} 
and S = {v,w}. The question, therefore, is: When is L({v}) = L({v, w})? 
If equality holds, then, w € L({v}). But we know that any element of this 
latter set is of the form av for some a € R. Hence we see that w = av for 
some a € R. Conversely, if w = av, then L({v}) = L({v, w}). In the same 
vein, we can solve the following exercise: 


Exercise 2.2.10 Let $v$ and $\{v_i\}_{i=1}^{n}$ be vectors in a vector space $V$. Let
$S' = \{v_i\}_{i=1}^{n}$ and $S = \{v\} \cup S'$. Then $L(S') = L(S)$ if and only if there
exist scalars $\alpha_i \in \mathbb{R}$, $1 \le i \le n$, such that $v = \sum_{i=1}^{n} \alpha_i v_i$.

In particular, we see that $v \in L(\{v_1, \ldots, v_k\})$ if and only if

$$L(\{v_1, \ldots, v_k\}) = L(\{v, v_1, \ldots, v_k\}).$$
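
For vectors in $\mathbb{R}^n$ the criterion above can be tested by machine. The following is a minimal sketch, assuming the standard fact (not yet established in the text) that adjoining $v$ leaves the span unchanged exactly when it leaves the rank unchanged; the rank is computed by row reduction with exact fractions, and the names `rank` and `in_span` are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(v, S):
    """True when L(S) = L(S with v adjoined), i.e. when v lies in L(S)."""
    return rank(S) == rank(list(S) + [v])

S = [(1, 2, 0), (0, 1, 1)]
print(in_span((1, 3, 1), S))  # (1, 3, 1) is the sum of the two vectors of S
print(in_span((0, 0, 1), S))
```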


Exercise 2.2.10 motivates the following definition: 


Definition 2.2.4 Let $v$ and $\{v_i\}_{i=1}^{k}$ be vectors in $V$. $v \in V$ is linearly
dependent on $v_1, \ldots, v_k$ if and only if there exist $\alpha_1, \ldots, \alpha_k \in \mathbb{R}$ such that
$v = \sum_{i=1}^{k} \alpha_i v_i$. We express this by saying that $v$ is a linear combination of
$v_1, \ldots, v_k$.


We want to look at this from a slightly different point of view. Suppose 
we are given that $v$ is linearly dependent on $v_1, \ldots, v_k$. Let $\{y_1, \ldots, y_{k+1}\}$
be a different labelling of $v, v_1, \ldots, v_k$. We want to say $\{y_1, \ldots, y_{k+1}\}$ is
a linearly dependent set. What do we mean by this? We mean that there
exists one element among $y_1, \ldots, y_{k+1}$, say $y_j$, which is a linear combination
of $y_1, \ldots, y_{j-1}, y_{j+1}, \ldots, y_{k+1}$. That is,

$$y_j = \sum_{\substack{i=1 \\ i \ne j}}^{k+1} \alpha_i y_i.$$

This suggests the following definition:


Definition 2.2.5 We say $\{v_1, \ldots, v_n\}$ are linearly dependent if there exist
$\alpha_i$, $1 \le i \le n$, not all zero such that $\sum_{i=1}^{n} \alpha_i v_i = 0$.




Exercise 2.2.11 Definition 2.2.5 is equivalent to the following one (with
which we started): $\{x_1, \ldots, x_n\}$ is linearly dependent if and only if there
exists an $x_j$ which is a linear combination of $x_i$, for $i \ne j$, that is, which is
a linear combination of the other elements. For, if $\alpha_j \ne 0$, then

$$\alpha_j x_j = -\sum_{i \ne j} \alpha_i x_i$$

so that

$$x_j = -\alpha_j^{-1} \sum_{i \ne j} \alpha_i x_i.$$


What is the geometric meaning underlying linear dependence of vectors? 
Let us look at $\mathbb{R}^2$ and $\mathbb{R}^3$ to answer this question. If $\{v, w\}$ is a linearly
dependent set in $\mathbb{R}^2$, then one of them is a linear combination of the other;
here it simply means that one is a scalar multiple of the other, say
$w = \alpha v$. Thus $w$ lies on the line joining 0 and $v$. (Do you see the genesis
of the word “linear”?) If $\{u, v, w\}$ is a linearly dependent set in $\mathbb{R}^3$, then,
say $w$ is a linear combination of $u$ and $v$: $w = \alpha u + \beta v$. I claim that this
means that $w$ lies on any plane containing the points $u$, $v$ and the origin.
Now any plane passing through the origin is given by an equation of the
form

$$ax + by + cz = 0, \quad (a, b, c) \ne (0, 0, 0). \tag{2.2.1}$$

Let $u = (x_1, y_1, z_1)$ and $v = (x_2, y_2, z_2)$. They lie on the plane given by
Equation (2.2.1) if and only if $ax_i + by_i + cz_i = 0$ for $i = 1, 2$. But this
is a homogeneous system of two equations in three unknowns $a$, $b$ and
$c$. Hence by Theorem 1.2.1, it has a non-trivial solution which we again
denote by $(a, b, c)$. Now, if $(x_i, y_i, z_i)$ is a solution of the homogeneous
Equation (2.2.1), so are $\alpha(x_1, y_1, z_1)$ and $\beta(x_2, y_2, z_2)$. Hence

$$\alpha(x_1, y_1, z_1) + \beta(x_2, y_2, z_2)$$

is a solution of Equation (2.2.1). But this simply says that $\alpha u + \beta v$ lies on
the plane given by Equation (2.2.1).


Exercise 2.2.12 Find whether v is a linear combination of v;’s in the 
following cases. 


(1) $v = (a,b)$, $v_1 = (1,0)$ and $v_2 = (1,1)$ in $\mathbb{R}^2$.
(2) $v = (0,0,1)$, $v_1 = (1,0,1)$ and $v_2 = (0,1,1)$.
(3) $v = (x_1, x_2, x_3, 1)$, $x_i \in \mathbb{R}$ arbitrary, $v_1 = (1,2,3,0)$, $v_2 = (2,3,1,0)$
and $v_3 = (3,2,1,0)$.


Can you generalize this? 




Exercise 2.2.13 Let $V = P_n$ be the vector space of polynomials of degree
less than or equal to $n$. Describe the sets $L(\{x^2 + x + 1, x\})$ and $L(\{x^2 + 1, x\})$.
Show that they are the same. Find which of the following polynomials lie
in $L(\{x^2 + x + 1, 1\})$:

(1) $2x^2 + 5x + 7$.
(2) $10^2 x^2 + 10x + 10$.
(3) $ex^2 + 0 \cdot x + e$, where $e$ is the usual base of the natural logarithm.


Exercise 2.2.14 Let $V = \mathbb{R}^3$ and $S = \{(1,0,1), (1,2,-1)\}$. Give a
geometric description of $L(S)$.


Exercise 2.2.15 If $V = \mathbb{R}^3$ and $S = \{v, w\}$, find the set of linear equations
which define the geometric object $L(S)$.


Exercise 2.2.16 Let $V = \mathbb{R}^n$ and $S = \{e_1, \ldots, e_k\}$, $1 \le k \le n$, where

$$e_j = (0, \ldots, 0, 1, 0, \ldots, 0), \quad 1 \le j \le k$$

(1 at the $j$th place). What is $L(S)$?

Exercise 2.2.17 If S is a vector subspace of V, what is L(S)? 


Exercise 2.2.18 The set of solutions $S$ of $ax + by + cz = 0$ for $(a,b,c) \ne 0$
is a vector subspace of $\mathbb{R}^3$. $S$ is the plane through the origin with normal
$(a,b,c)$.


Definition 2.2.6 If A and B are nonempty subsets of a vector space V, 
we denote by $A + B$ the subset $\{a + b \mid a \in A,\ b \in B\}$.


Exercise 2.2.19 Let the notation be as in Definition 2.2.6. Show that
$L(\{v, w\}) = \mathbb{R}v + \mathbb{R}w$. More generally, show that $L(S) = \mathbb{R}v_1 + \cdots + \mathbb{R}v_k$
if $S = \{v_1, \ldots, v_k\}$.


Exercise 2.2.20 Given $W_1$, $W_2$ vector subspaces of $V$, does there exist
any smallest vector subspace $W_3$ containing $W_1$ and $W_2$?


Exercise 2.2.21 Let $W$ be a vector subspace of $V$. What is $w + W$ if
$w \in W$? What is $W + W$? Is it true that $w + W = W$ if and only if
$w \in W$?


Exercise 2.2.22 If $W_1$, $W_2$ are vector subspaces of a vector space $V$, then
$W_1 + W_2$ is a vector subspace of $V$. What is $W_1 + W_2$ if $W_1 = W_2$? More
generally, what is $W_1 + W_2$ if $W_1 \subset W_2$?


Exercise 2.2.23 Let $W_i$, $1 \le i \le 2$, be vector subspaces of a vector space
$V$. When is $W_1 \cup W_2$ a subspace of $V$?




Exercise 2.2.24 Let $V$ be the set of Cauchy sequences in $\mathbb{R}$. Let $W$ be
the set of convergent sequences in $\mathbb{R}$. For $x = (x_n)$, $y = (y_n) \in V$, we define
$x + y := (x_n + y_n)$ and $\alpha x := (\alpha x_n)$. Then $V$ is a vector space and $W$ is
a vector subspace of $V$ over $\mathbb{R}$. Is $W$ a proper subset of $V$? (You need
analysis to answer this!)


Exercise 2.2.25 What are all the vector subspaces of $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$? What
are their geometric descriptions? 


Exercise 2.2.26 If $W_i$, $1 \le i \le 2$, are vector subspaces of a vector space
$V$, then $W_1 \cap W_2$ is a vector subspace.


Exercise 2.2.27 If $\{W_i\}_{i \in I}$ is a family of vector subspaces of $V$ indexed
by a set $I$, then $\bigcap_{i \in I} W_i$ is a vector subspace.


Exercise 2.2.28 If $S$ is an arbitrary subset of a vector space $V$, then
$L(S) = \langle S \rangle = \mathrm{Span}(S)$ is the intersection of all vector subspaces
containing $S$.

A meta exercise: Can you identify the places in the foregoing arguments
where our themes appeared?


2.3 Basis and Dimension of a Vector Space 


The beginning of this section is a repetition of what we have seen in
Section 2.2. These definitions are repeated here since they introduce two
of the most important concepts in linear algebra.


Definition 2.3.1 A vector $v \in V$ is said to be a linear combination of
vectors $v_1, \ldots, v_k$ if there exist $\alpha_i \in \mathbb{R}$ such that $v = \sum_{i=1}^{k} \alpha_i v_i$.


Definition 2.3.2 A set $S = \{v_1, \ldots, v_n\}$ in a vector space $V$ is said to be
linearly dependent if there exist $\alpha_i \in \mathbb{R}$, $1 \le i \le n$, not all $\alpha_i$'s zero, such
that $\sum_{i=1}^{n} \alpha_i v_i = 0$.
$S$ is said to be linearly independent if it is not linearly dependent. In
other words, if $\sum_{i=1}^{n} \alpha_i v_i = 0$ then $\alpha_i = 0$ for all $1 \le i \le n$. (Can you
convince yourself of this? Most often this is the formulation which is used.)


Exercise 2.3.1 If $\{v_1, \ldots, v_n\}$ is linearly dependent, then there exists $i$
such that $v_i = \sum_{j \ne i} \alpha_j v_j$, for some $\alpha_j \in \mathbb{R}$.


Exercise 2.3.2 If 0 € S then S is linearly dependent. 


Exercise 2.3.3 {v,w} C V is linearly dependent if and only if one is a 
scalar multiple of the other. 


Exercise 2.3.4 The vectors $(a,b)$ and $(c,d)$ in $\mathbb{R}^2$ are linearly independent
if and only if $ad - bc \ne 0$.
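
The criterion of Exercise 2.3.4 is easy to apply mechanically. A minimal sketch follows; the function name `independent_in_plane` is hypothetical and the sample vectors are chosen only for illustration.

```python
def independent_in_plane(v, w):
    """(a, b) and (c, d) are linearly independent iff ad - bc != 0."""
    (a, b), (c, d) = v, w
    return a * d - b * c != 0

print(independent_in_plane((1, 2), (4, 3)))  # ad - bc = 1*3 - 2*4 = -5
print(independent_in_plane((1, 2), (2, 4)))  # ad - bc = 1*4 - 2*2 = 0
```

In the second pair, $(2,4) = 2(1,2)$, which agrees with Exercise 2.3.3.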




Exercise 2.3.5 Show that if v and w are linearly independent vectors in 
V, then so are v + w and v — w. 


Exercise 2.3.6 Let $S_1$ be a linearly dependent subset of a vector space $V$
and $S_2$ be such that $S_2 \supset S_1$. Then prove that $S_2$ is linearly dependent.
State and prove a similar property for linear independence.


Theorem 2.3.1 Let $V$ be a vector space. Then $\{v_1, \ldots, v_n\}$ is linearly
dependent if and only if one of the $v_i$'s is a linear combination of the other
$v_j$'s.


Proof Suppose $\{v_1, \ldots, v_n\}$ is linearly dependent. Then there exist $\alpha_i \in \mathbb{R}$, not
all zero, such that $\sum_{i=1}^{n} \alpha_i v_i = 0$. Suppose $\alpha_i \ne 0$ for some $i$. Then
$\alpha_1 v_1 + \cdots + \alpha_i v_i + \cdots + \alpha_n v_n = 0$. Hence $v_i = -\alpha_i^{-1} \left(\sum_{j \ne i} \alpha_j v_j\right)$ and
therefore $v_i = \sum_{j \ne i} \beta_j v_j$ where $\beta_j = -\alpha_i^{-1} \alpha_j \in \mathbb{R}$. Hence $v_i$ is a linear
combination of the other $v_j$'s.

We now prove the converse. That is, if for some $i$, $v_i$ is expressed as a
linear combination of $v_j$, $j \ne i$, then $\{v_1, \ldots, v_n\}$ is a linearly dependent
set. Suppose $v_i = \sum_{j \ne i} \alpha_j v_j$. Then we have

$$\alpha_1 v_1 + \cdots + (-1) v_i + \alpha_{i+1} v_{i+1} + \cdots + \alpha_n v_n = 0.$$

That is, there exist $\alpha_1, \ldots, \alpha_n$, with $\alpha_i = -1 \ne 0$, such that $\sum_{j} \alpha_j v_j = 0$.
Hence $\{v_1, \ldots, v_n\}$ is linearly dependent.

□


Remark 2.3.1 The above theorem does not say the following. If $\{v_i\}_{i=1}^{n}$ is
linearly dependent, then any $v_i$ is a linear combination of other $v_j$'s, $j \ne i$.
For instance, let us consider $V = \mathbb{R}^2$ and $e_1 := (1,0)$ and $e_2 := (0,1)$. Then
$\{e_1, e_2, 2e_1\}$ is obviously a linearly dependent set as

$$-2e_1 + 0e_2 + 2e_1 = 0.$$

But $e_2$ is not a linear combination of $\{e_1, 2e_1\}$. (Prove this. Or, see
Exercise 2.2.12 (3).)


Definition 2.3.3 A set $B = \{e_1, \ldots, e_n\}$ in a vector space $V$ is said to be a
basis of $V$ if every vector $v \in V$ can be expressed uniquely as $v = \sum_{i=1}^{n} \alpha_i e_i$
where $\alpha_i \in \mathbb{R}$. That is, if $v = \sum_{i=1}^{n} \alpha_i e_i = \sum_{i=1}^{n} \beta_i e_i$ then $\alpha_i = \beta_i$ for
$1 \le i \le n$.


We now look at some elementary examples. We shall have a lot more 
examples in the form of exercises later. 




Example 2.3.1 Let $V := \mathbb{R}^n$. If $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, we call $x_j$ the
$j$th coordinate of $x$. Let $e_i := (0, \ldots, 0, 1, 0, \ldots, 0)$ be the vector whose $j$th
coordinate is zero unless $j = i$ in which case it is 1. It is easy to show that
$\{e_i \mid 1 \le i \le n\}$ is a basis of $V$. This is called the standard basis of $\mathbb{R}^n$.
In the sequel, when we write $e_i \in \mathbb{R}^n$ it refers to the basis vector as above.


Example 2.3.2 Let $V := M(n,\mathbb{R})$. Let $E_{ij}$ be the element of $V$ whose
$(i,j)$th entry is one and the rest are zero. In notation, if $E_{ij} = (x_{rs})$, we
have

$$x_{rs} = \begin{cases} 0, & \text{if } r \ne i \text{ or } s \ne j \\ 1, & \text{if } r = i \text{ and } s = j. \end{cases}$$

Then $\{E_{ij} \mid 1 \le i, j \le n\}$ is a basis for $M(n,\mathbb{R})$.

Example 2.3.3 Let $V$ be the vector space of polynomials in the variable $X$
of degree less than or equal to $n$, $n \in \mathbb{N}$. Then the set $\{X^k \mid 0 \le k \le n\}$
is a basis of $V$.


Example 2.3.4 Let $V := \mathbb{R}^2$. Let

$$v_1 := e_1 + e_2 = (1, 1), \quad \text{and} \quad v_2 := e_1 - e_2 = (1, -1).$$

We claim that $\{v_1, v_2\}$ is a basis of $\mathbb{R}^2$. We are supposed to establish the
following:
(i) any vector $v = (x, y)$ is a linear combination of the $v_i$, and
(ii) this linear combination is unique.
To prove (i), let us assume that $v = (x, y) = \alpha v_1 + \beta v_2$. Then we have

$$(x, y) = \alpha(1, 1) + \beta(1, -1) = (\alpha, \alpha) + (\beta, -\beta) = (\alpha + \beta, \alpha - \beta).$$

Consequently, we see that $x = \alpha + \beta$ and $y = \alpha - \beta$. Hence $\alpha = (x + y)/2$
and $\beta = (x - y)/2$. Thus, we have

$$(x, y) = \frac{x + y}{2} v_1 + \frac{x - y}{2} v_2.$$

The above argument also shows the uniqueness of this expression.
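
The formulas for the coefficients can be checked on a sample vector. This is a small sketch using exact fractions; the names `coordinates` and `combine` are hypothetical.

```python
from fractions import Fraction

def coordinates(x, y):
    """Coefficients (alpha, beta) with (x, y) = alpha*(1, 1) + beta*(1, -1)."""
    return Fraction(x + y, 2), Fraction(x - y, 2)

def combine(alpha, beta):
    """Form alpha*v1 + beta*v2 for v1 = (1, 1), v2 = (1, -1)."""
    v1, v2 = (1, 1), (1, -1)
    return tuple(alpha * s + beta * t for s, t in zip(v1, v2))

alpha, beta = coordinates(3, 7)
print(alpha, beta, combine(alpha, beta))
```

Here the coefficients come out as 5 and -2, and recombining them returns the vector $(3, 7)$.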
Exercise 2.3.7 Show that $\{(1,2), (4,3)\}$ is a basis of $\mathbb{R}^2$.

Exercise 2.3.8 When is $\{v, w\}$ a basis of $\mathbb{R}^2$?


Exercise 2.3.9 Show that $\{X, 3X^2, 5 + X\}$ is a basis of $P_2$. What about
$\{9X, X^2 - 3X, 2X^2\}$?




Exercise 2.3.10 Show that $\{1, (X - a), (X - a)^2, \ldots, (X - a)^n\}$ is a basis
of $P_n$ for all $a \in \mathbb{R}$. Hints (for three different proofs): (i) Expand
$X^k = ((X - a) + a)^k$ by binomial theorem. (ii) Use induction to show
that $X^k$ is a linear combination of $(X - a)^j$ for $0 \le j \le k$. (iii) Use Taylor
expansion of the function $f(X) = X^k$ at the point $X = a$. This allows you
to directly exhibit the coefficients in the linear combination.
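
Hint (i) can be tried numerically. The sketch below computes the coefficients $c_j = \binom{k}{j} a^{k-j}$ given by the binomial theorem and compares both sides at several integer points; the names `coefficients` and `evaluate` are hypothetical, and agreement at sample points only illustrates the identity.

```python
from math import comb

def coefficients(k, a):
    """c_j with X^k = sum_j c_j (X - a)^j, from expanding ((X - a) + a)^k."""
    return [comb(k, j) * a ** (k - j) for j in range(k + 1)]

def evaluate(coeffs, a, x):
    """Value of sum_j coeffs[j] * (x - a)^j at the point x."""
    return sum(c * (x - a) ** j for j, c in enumerate(coeffs))

k, a = 4, 3
coeffs = coefficients(k, a)
print(coeffs)
print(all(evaluate(coeffs, a, x) == x ** k for x in range(-5, 6)))
```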


Remark 2.3.2 In view of Example 2.3.1, Example 2.3.4 and Exercise 2.3.8, 
it should be clear that the basis is not unique for a vector space. 


Exercise 2.3.11 Fix a basis $\{v_1, \ldots, v_n\}$ of $V$. Let $v = \sum_{i=1}^{n} x_i v_i$ and
$w = \sum_{i=1}^{n} y_i v_i$. Show that $v + w = \sum_{i=1}^{n} (x_i + y_i) v_i$ and $\alpha v = \sum_{i=1}^{n} (\alpha x_i) v_i$
are the unique expressions of $v + w$ and $\alpha v$ in terms of the basis vectors
$v_i$. While writing a formal proof, see which of the properties in Definition
2.1.1 of a vector space are used.


Lemma 2.3.2 Any basis $\{e_i\}$ of $V$ is a linearly independent set.


Proof If a basis $\{e_i\}$ is linearly dependent, then there exist scalars $\alpha_i \in \mathbb{R}$,
not all zero, such that $\sum_{i} \alpha_i e_i = 0$. However, we have $0 = 0 \cdot e_1 + \cdots + 0 \cdot e_n$.
Since $\{e_i\}$ is a basis, by uniqueness of coefficients, we deduce $\alpha_i = 0$ for all
$i$. This contradicts our assumption on $\alpha_i$'s.

□


Theorem 2.3.3 If a vector space $V$ has a basis of $n$ elements then any set
of $n + 1$ vectors is linearly dependent.


Proof Let $\{e_1, \ldots, e_n\}$ be a basis of $V$ and let $\{v_1, \ldots, v_{n+1}\}$ be any set
of $n + 1$ vectors. Since $\{e_1, \ldots, e_n\}$ is a basis of $V$ there exist $\alpha_{ij} \in \mathbb{R}$
such that $v_i = \sum_{j=1}^{n} \alpha_{ij} e_j$, $1 \le i \le n + 1$. To show that $\{v_i\}_{i=1}^{n+1}$ is linearly
dependent, we need to find $\beta_i \in \mathbb{R}$, not all zero, such that

$$\sum_{i=1}^{n+1} \beta_i v_i = 0. \tag{2.3.1}$$

That is, $\sum_{i=1}^{n+1} \beta_i \left(\sum_{j=1}^{n} \alpha_{ij} e_j\right) = 0$. This means that we need to find
$\{\beta_i\}_{i=1}^{n+1}$ such that

$$\sum_{j=1}^{n} \left(\sum_{i=1}^{n+1} \beta_i \alpha_{ij}\right) e_j = 0, \tag{2.3.2}$$

is the zero vector. Since $\{e_j\}_{j=1}^{n}$ is a basis, $0 = 0e_1 + \cdots + 0e_n$ is the
unique representation of 0. Hence equating the coefficients of $e_j$, we get
$\sum_{i=1}^{n+1} \beta_i \alpha_{ij} = 0$, $1 \le j \le n$. This is a system of $n$ equations in $n + 1$
unknowns $\beta_i$. By Theorem 1.2.1, there exists a non-trivial solution, which we
denote by $\{\beta_1, \ldots, \beta_{n+1}\}$. Thus there exist scalars $\beta_i$, $1 \le i \le n + 1$, such
that Equation (2.3.2) is satisfied. Consequently, $\{v_1, \ldots, v_{n+1}\}$ is linearly
dependent.

□
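
The proof reduces the question to solving a homogeneous system with more unknowns than equations. For vectors in $\mathbb{R}^n$ this can be carried out by machine. The following is a minimal sketch, assuming row reduction with exact fractions as the solution method (the text itself only invokes Theorem 1.2.1); the function name `dependence_relation` is hypothetical.

```python
from fractions import Fraction

def dependence_relation(vectors):
    """Return [beta_1, ..., beta_m], not all zero, with sum beta_i * v_i = 0,
    or None if the vectors are linearly independent."""
    m = len(vectors)     # number of unknowns beta_i
    n = len(vectors[0])  # number of equations, one per coordinate
    # Row j encodes the equation sum_i beta_i * alpha_ij = 0.
    rows = [[Fraction(vectors[i][j]) for i in range(m)] for j in range(n)]
    pivot_cols = []
    r = 0
    for col in range(m):
        pivot = next((i for i in range(r, n) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        pivot_value = rows[r][col]
        rows[r] = [x / pivot_value for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivot_cols.append(col)
        r += 1
    free = [c for c in range(m) if c not in pivot_cols]
    if not free:
        return None
    # Set one free unknown to 1 and read off the pivot unknowns.
    beta = [Fraction(0)] * m
    beta[free[0]] = Fraction(1)
    for row_index, col in enumerate(pivot_cols):
        beta[col] = -rows[row_index][free[0]]
    return beta

# Three vectors in R^2 are always linearly dependent.
vectors = [(1, 2), (3, 4), (5, 6)]
beta = dependence_relation(vectors)
print(beta)
```

With $m = n + 1$ vectors in $\mathbb{R}^n$ there are at most $n$ pivots, so a free unknown always exists; this mirrors the counting of equations and unknowns in the proof.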
Corollary 2.3.4 Any basis $\{e_i\}_{i=1}^{n}$ of $V$ is a maximal linearly independent
set, that is, if $S$ is a subset of $V$ and $\{e_i\}$ is a proper subset of $S$, then $S$
is linearly dependent.


Proof Let $v \in S \setminus \{e_i\}_{i=1}^{n}$. By Theorem 2.3.3, $\{e_1, \ldots, e_n, v\}$ is linearly
dependent. Thus there exist real numbers $\alpha_i \in \mathbb{R}$, $1 \le i \le n + 1$, at least
one of them non-zero, such that $\alpha_1 e_1 + \cdots + \alpha_n e_n + \alpha_{n+1} v = 0$. Since
$\{v, e_1, \ldots, e_n\}$ is linearly dependent and is a subset of $S$, $S$ is linearly
dependent by Exercise 2.3.6.

□
Definition 2.3.4 A vector space $V$ is finite dimensional if there exists a
finite subset $S$ of $V$ such that $L(S) = V$. That is, $V$ is finite dimensional if
and only if there exists a finite set $\{v_i\}_{i=1}^{n}$ such that any $v \in V$ is a linear
combination of $v_i$'s.


Exercise 2.3.12 If a vector space V has a basis with a finite number of 
elements then it is finite dimensional. 


Remark 2.3.3 The converse of Exercise 2.3.12 is true. See Theorem 2.3.8 
and its corollary. 


Theorem 2.3.5 Let $V$ be a vector space. Assume $V$ has a basis consisting
of $n$ elements. Then any linearly independent set of $n$ vectors in $V$ is a
basis of $V$.


Proof Let $\{e_1, \ldots, e_n\}$ be a linearly independent set and let $v \in V$. Then
$\{v, e_1, \ldots, e_n\}$ is a set of $n + 1$ vectors and hence linearly dependent by
Theorem 2.3.3. By definition, there exist $\alpha_i \in \mathbb{R}$, not all zero, such that
$\alpha_0 v + \alpha_1 e_1 + \cdots + \alpha_n e_n = 0$. If $\alpha_0 = 0$, then by assumption there exists a
$k$, $1 \le k \le n$, such that $\alpha_k \ne 0$. Since $\alpha_0 v + \sum_{i=1}^{n} \alpha_i e_i = 0$ and $\alpha_0 = 0$, we
see that $\alpha_1 e_1 + \cdots + \alpha_n e_n = 0$. But then $\{e_i\}_{i=1}^{n}$ is linearly independent so
that $\alpha_i = 0$ for $1 \le i \le n$. This contradicts our observation that $\alpha_k \ne 0$ for
some $1 \le k \le n$. Therefore we are forced to conclude that $\alpha_0 \ne 0$. Hence
$v = -\alpha_0^{-1} (\alpha_1 e_1 + \cdots + \alpha_n e_n)$. Moreover, this expression is unique. For,
if $v = \sum_{i=1}^{n} \alpha_i e_i = \sum_{i=1}^{n} \beta_i e_i$, then $\sum_{i=1}^{n} (\alpha_i - \beta_i) e_i = 0$. Since $\{e_i\}_{i=1}^{n}$ is
linearly independent, we have $\alpha_i - \beta_i = 0$ for all $i$ or $\alpha_i = \beta_i$ for all $i$.
Therefore $\{e_1, \ldots, e_n\}$ is a basis of $V$.

□


Theorem 2.3.6 Let $V$ be a finite dimensional vector space. Then any two
bases of $V$ have the same number of elements.


Proof 1. Let $\{e_1, \ldots, e_n\}$ and $\{f_1, \ldots, f_m\}$ be two bases of $V$. If $m = n$
we have nothing to prove. Suppose $m > n$. Then $m \ge n + 1$. Since
$\{e_1, \ldots, e_n\}$ is a basis of $V$, by Theorem 2.3.3 (applied to $f_1, \ldots, f_{n+1}$,
together with Exercise 2.3.6), $\{f_1, \ldots, f_m\}$ is linearly
dependent. Therefore there exist $\alpha_i \in \mathbb{R}$, $1 \le i \le m$, not all zero, such
that $\alpha_1 f_1 + \cdots + \alpha_m f_m = 0$. But we also have $0 f_1 + \cdots + 0 f_m = 0$. Since
$\{f_1, \ldots, f_m\}$ is a basis, 0 is written uniquely as a linear combination of $f_i$'s.
Therefore $\alpha_i = 0$ for all $1 \le i \le m$, a contradiction. Hence we conclude
that $m \le n$. Similarly, we prove $n \le m$. So $m = n$.

□


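Proof 2. Let $\{v_i\}_{i=1}^m$ and $\{w_j\}_{j=1}^n$ be bases of V. We must show that
$m = n$.

Since $\{v_i\}$ is a basis and $w_j \in V$ there is a unique expression (that is, a
linear combination)

$$w_j = \sum_{i=1}^m \alpha_{ji} v_i. \tag{2.3.3}$$

Arguing similarly, we have

$$v_i = \sum_{r=1}^n \beta_{ir} w_r. \tag{2.3.4}$$

Using Equation (2.3.4) in Equation (2.3.3), we get

$$w_j = \sum_{i=1}^m \alpha_{ji} v_i = \sum_{i=1}^m \alpha_{ji} \Big( \sum_{r=1}^n \beta_{ir} w_r \Big) = \sum_{r=1}^n \Big( \sum_{i=1}^m \alpha_{ji} \beta_{ir} \Big) w_r. \tag{2.3.5}$$

Similarly using Equation (2.3.3) in Equation (2.3.4), we get

$$v_i = \sum_{j=1}^m \Big( \sum_{r=1}^n \beta_{ir} \alpha_{rj} \Big) v_j. \tag{2.3.6}$$

We already have $w_j = 0 w_1 + \cdots + 0 w_{j-1} + 1 \cdot w_j + 0 w_{j+1} + \cdots + 0 w_n$.
Since $\{w_j\}$ is a basis, by uniqueness of the expression, we infer from
Equation (2.3.5) that

$$\sum_{i=1}^m \alpha_{ji} \beta_{ir} = \begin{cases} 0, & \text{if } r \ne j \\ 1, & \text{if } r = j. \end{cases} \tag{2.3.7}$$

By similar reasoning, we infer from Equation (2.3.6) that

$$\sum_{r=1}^n \beta_{ir} \alpha_{rj} = \begin{cases} 0, & \text{if } j \ne i \\ 1, & \text{if } j = i. \end{cases} \tag{2.3.8}$$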




In particular, from Equation (2.3.7), it follows that $\sum_{i=1}^m \alpha_{ji} \beta_{ij} = 1$ for all
$1 \le j \le n$. Summing it over $j$, we get

$$\sum_{j=1}^n \Big( \sum_{i=1}^m \alpha_{ji} \beta_{ij} \Big) = n. \tag{2.3.9}$$

Similarly, we get from Equation (2.3.8)

$$\sum_{i=1}^m \Big( \sum_{r=1}^n \beta_{ir} \alpha_{ri} \Big) = m. \tag{2.3.10}$$

But, obviously, the left sides of Equation (2.3.9) and Equation (2.3.10) are
the same. Hence $m = n$.

□


Definition 2.3.5 We say a vector space V is of dimension n if it has a
basis consisting of n elements. That is, if there exist n vectors $e_1, \dots, e_n$ in
V such that any $v \in V$ can be written uniquely as $v = \sum_{i=1}^n \alpha_i e_i$, $\alpha_i \in \mathbb{R}$.
This is well-defined in view of Theorem 2.3.6. We then write $\dim V = n$.


Exercise 2.3.13 Determine which of the examples (and exercises) in 
Section 2.1 are finite dimensional and find their dimensions. 


Exercise 2.3.14 Let W be a vector subspace of V with $\dim W = \dim V$.
Then $W = V$.


Definition 2.3.6 Let V be a finite dimensional vector space over $\mathbb{R}$ and
$\{e_1, \dots, e_n\}$ a basis of V. Given any $v \in V$, there exist unique constants
$\alpha_i \in \mathbb{R}$ such that $v = \sum_{i=1}^n \alpha_i e_i$. We denote $x_i(v) := \alpha_i$. The functions
$x_i : V \to \mathbb{R}$ are called the coordinate functions with respect to the basis
$\{e_i\}_{i=1}^n$. $\alpha_k$ or $x_k(v)$ is called the kth coordinate of v with respect to the
basis $\{e_i\}_{i=1}^n$.


This is the significance of the basis. Given a basis we get a coordinate 
system on V. We thus require n coordinates (no more, no less) to determine 
an arbitrary vector v € V uniquely. 

In other words, a vector space V is n-dimensional if “it requires n coor- 
dinates” to locate the points uniquely. In physics, this is said as “n degrees 
of freedom for a particle v € V to move”. This intuitive way of thinking of 
dimension is quite useful geometrically. We shall put this idea into use in 
the next couple of examples and then give you problems for practice. 




Example 2.3.5 Let $W = \{(x, y, z) \in \mathbb{R}^3 \mid y = z\}$. We know that W is
a vector subspace of $\mathbb{R}^3$. We show that its dimension is two by exhibiting a
basis. If $w \in W$, to "locate" w, we need only 2 "coordinates". For example,
if we know its x and y coordinates, then we know its third coordinate z.
Thus the dimension of W must be 2. To see this, if $(x, y, z) \in W$ then $z = y$ so
that $(x, y, y) = x(1,0,0) + y(0,1,0) + y(0,0,1) = x(1,0,0) + y(0,1,1)$. Thus,
we see that $L(\{(1,0,0), (0,1,1)\}) = W$. We claim that $(1,0,0)$ and $(0,1,1)$
are linearly independent. This is clear, since neither is a scalar multiple of
the other. See Exercise 2.3.3. Thus, $\{(1,0,0), (0,1,1)\}$ is a basis of W so
that $\dim(W) = 2$.
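
The claims of Example 2.3.5 can be checked numerically. The following Python sketch is only an illustration: the helper `rank` is not part of the text (it is a hypothetical function that computes the rank of a family of vectors by Gaussian elimination over exact rationals), and the sample values of x and y are arbitrary.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a family of vectors (given as rows), by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # find a row at or below position r with a nonzero entry in column c
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

basis = [(1, 0, 0), (0, 1, 1)]

# Both vectors satisfy the defining condition y = z of W.
in_W = all(y == z for (_, y, z) in basis)

# Two vectors are linearly independent exactly when their rank is 2.
independent = rank(basis) == len(basis)

# An arbitrary element (x, y, y) of W equals x*(1,0,0) + y*(0,1,1).
x, y = 5, -3
w = tuple(x * a + y * b for a, b in zip(*basis))
```

Adjoining `w` to `basis` does not raise the rank, which is the computational counterpart of saying that w already lies in the span of the two basis vectors.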


Example 2.3.6 Let $S_n$ denote the set of symmetric matrices in $M(n, \mathbb{R})$.
Then $S_n$ is a vector subspace of $M(n, \mathbb{R})$. What is its dimension? We shall
work out the case $n = 2$ and leave the general case as an exercise.
Let

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \in S_2.$$

In general, to "locate" any matrix

$$X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix} \in M_{2 \times 2},$$

we need four coordinates $(x_{11}, x_{12}, x_{21}, x_{22})$. However, if A is symmetric
and we know $a_{12}$, then we know $a_{21}$. Thus to "locate" A, we need only
three coordinates $a_{11}, a_{12}, a_{22}$. Therefore we expect that the dimension of
$S_2$ is three. To see this, recall the basis $\{E_{ij}\}$ of $M(n, \mathbb{R})$ where $E_{ij}$ is the
matrix whose ijth entry is 1 and all other entries are zero. If $A \in S_2$,
then

$$\begin{aligned} A &= a_{11} E_{11} + a_{12} E_{12} + a_{21} E_{21} + a_{22} E_{22} \\ &= a_{11} E_{11} + a_{12} (E_{12} + E_{21}) + a_{22} E_{22}. \end{aligned}$$

Thus $\{E_{11}, E_{12} + E_{21}, E_{22}\}$ generates $S_2$. It is easily seen that it is linearly
independent so that it is a basis.


Exercise 2.3.15 Let $W = \{(x_1, \dots, x_n) \mid x_1 + x_2 + \cdots + x_n = 0\} \subset \mathbb{R}^n$.
Find a basis and dimension of W.


Exercise 2.3.16 If V = R" and W = {(z,... ,2n) | 21 = zp}, finda 
basis and dimension of W. 


Exercise 2.3.17 If $V = \mathbb{R}^n$ and $W = \{(x_1, \dots, x_n) \mid x_k = 0 \text{ if } k \text{ is even}\}$,
find a basis and dimension of W.


Exercise 2.3.18 Let $W = \{(x_1, \dots, x_n) \mid x_k\text{'s are all equal for } k \text{ even}\}$.
Find a basis and dimension of W.




Exercise 2.3.19 Find a basis for $S_n$ of symmetric matrices.


Exercise 2.3.20 Let $V = M(n, \mathbb{R})$ and let $A_n$ be the set of all skew-
symmetric matrices. Find a basis and dimension of $A_n$.


Exercise 2.3.21 Let $V = M(n, \mathbb{R})$ and $W = \{X \in V \mid \operatorname{trace} X = 0\}$.
Find a basis and dimension of W. (Recall $\operatorname{tr}(X) = \sum_i x_{ii}$ if $X = (x_{ij})$.)


Exercise 2.3.22 Find a basis of $\mathbb{C}$, considered as a vector space over $\mathbb{R}$
(see Exercise 2.1.21). Hint: Any $z \in \mathbb{C}$ can be written as $z = x + iy$,
$x, y \in \mathbb{R}$.


Exercise 2.3.23 Can you exhibit a basis of $P_n$ consisting of elements all
of degree n? All of degree $\le n-1$?

Exercise 2.3.24 The notation is as in Exercise 2.3.20. Given a bas; 
a {vi}%, of V and {wj}%_, of W, find a basis of V9 W. — 
Exercise 2.3.25 The notation is as in Exercise 2.1.19. Given a basis 


{v;}™, of V, find a basis of {+} x V. 


t=1 


Remark 2.3.4 You should realize that "more the merrier" is not true here!
That is, if we adjoin more elements to a basis we may not be able to "locate"
any vector v "more precisely" than what we could do with the given basis.

For example, if $V = \mathbb{R}^2$ and $e_1 = (1, 0)$ and $e_2 = (0, 1)$, then $\{e_1, e_2\}$
is a basis of V. Let us take $S = \{e_1, e_2, e_3 = (1, 1)\}$. With respect to
$\{e_1, e_2, e_3\}$, we can write

$$\begin{aligned} v = (1, 1) &= 1 e_1 + 1 e_2 + 0 e_3 \\ &= \tfrac{1}{2} e_1 + \tfrac{1}{2} e_2 + \tfrac{1}{2} e_3 \\ &= 0 e_1 + 0 e_2 + 1 e_3. \end{aligned}$$

That is, there are lots of ways of giving coordinates to v as $(1, 1, 0)$ or
$(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2})$ or $(0, 0, 1)$ and a host of others!
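
The loss of uniqueness can be seen concretely. The Python sketch below is an illustration only: the helper `combine` is a hypothetical name, and the three coefficient triples are sample choices. It forms the linear combination of $e_1, e_2, e_3$ for each triple and collects the results.

```python
from fractions import Fraction

# The generating set S = {e1, e2, e3} of R^2 from the remark.
S = [(1, 0), (0, 1), (1, 1)]

def combine(coeffs, vectors):
    """Return the linear combination sum_i coeffs[i] * vectors[i]."""
    dim = len(vectors[0])
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim))

half = Fraction(1, 2)
candidates = [(1, 1, 0), (half, half, half), (0, 0, 1)]

# Each coefficient triple is a different "coordinate" description of a vector.
results = [combine(c, S) for c in candidates]
```

All three entries of `results` equal the same vector (1, 1), so with respect to S the vector has no well-defined coordinates.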
The next definition is the same as Definition 2.2.3 and nearby items.


Definition 2.3.7 Let S be a non-empty subset of a vector space V. An
element $v \in V$ of the form $v = \sum_{i=1}^n \alpha_i v_i$, $v_i \in S$, $\alpha_i \in \mathbb{R}$, is called a
finite linear combination of elements of S. The set of all such finite linear
combinations is called the linear span of S, denoted by $\operatorname{Span}\{S\}$, $L(S)$ or
$\langle S \rangle$, and S is called a set of generators of $L(S)$.


Example 2.3.7 Every vector space has a set of generators. Since for a
vector space V, $V = L(V)$, V itself is a set of generators of V. Any basis
is a generating set.




In fact, a basis is a minimal set of generators. Can you make this 
statement precise and prove it? (Hint: See Theorem 2.3.7 below.) 


Remark 2.3.5 Let $V = \mathbb{R}^2$. Then,

$$\begin{aligned} S_1 &= \{(1, 0), (0, 1)\}, \\ S_2 &= \{(1, 0), (0, 1), (1, 1)\}, \text{ and} \\ S_3 &= \{(0, 0), (1, 0), (-1, 0), (0, -1), (1, 1)\} \end{aligned}$$

are all generating sets of $\mathbb{R}^2$. That is, $\mathbb{R}^2$ is spanned by $S_1$, $S_2$ as well as $S_3$.
A number of questions arise here: Which spaces can be spanned by a finite
set of elements? Also, if a space can be spanned by a finite set of elements,
what is the smallest number of elements required?


Theorem 2.3.7 Let V be a finite dimensional vector space. Then the
following are equivalent:


(1) $\{e_1, \dots, e_n\}$ is a basis of V.
(2) $\{e_1, \dots, e_n\}$ is a maximal linearly independent set.
(3) $\{e_1, \dots, e_n\}$ is a minimal generating set.


Proof (1) $\Rightarrow$ (2) Let $\{e_1, \dots, e_n\}$ be a basis. We first show that $\{e_i\}$
is linearly independent. If $\sum \alpha_i e_i = 0$, then the zero vector is expressed
as a linear combination of the $e_i$. The zero vector already has the expression
$0 = 0 e_1 + \cdots + 0 e_n$. Hence by the uniqueness of the expression, $\alpha_i = 0$
for all $i$. Therefore $\{e_1, \dots, e_n\}$ is a linearly independent set. We now
claim that it is maximal linearly independent. That is, if we add even one
more element $x$ to $\{e_i\}_{i=1}^n$, then the resulting set $\{x, e_1, \dots, e_n\}$ is linearly
dependent. This follows from Theorem 2.3.3. Or, more directly, since $\{e_i\}$
is a basis, we can write $x$ as a linear combination of the $e_i$'s: $x = \sum_{i=1}^n \alpha_i e_i$. But
then $\alpha_0 x - \alpha_1 e_1 - \cdots - \alpha_n e_n = 0$ with $\alpha_0 = 1$. Thus the set $\{e_1, \dots, e_n, x\}$
is linearly dependent.


(2) $\Rightarrow$ (3) To this end, we need to show two things:
(a) The $e_i$'s generate V, that is, any v is a linear combination of the $e_i$'s.

(b) No proper subset of $\{e_i\}$ has the property (a), that is, we cannot
generate all of V by any proper subset of $\{e_i\}_{i=1}^n$.

To prove (a), let $v \in V$ be given. Since $\{e_1, \dots, e_n\}$ is a maximal linearly
independent set, $\{v, e_1, \dots, e_n\}$ is linearly dependent. Hence there exist
scalars $\alpha_i$, $0 \le i \le n$, with at least one of the $\alpha_i$'s different from zero such
that $\alpha_0 v + \alpha_1 e_1 + \cdots + \alpha_n e_n = 0$. We want to write $v = -\alpha_0^{-1} \sum_{i=1}^n \alpha_i e_i$.




We need to show that $\alpha_0 \ne 0$. If $\alpha_0 = 0$, then $\alpha_1 e_1 + \cdots + \alpha_n e_n = 0$
and not all $\alpha_i = 0$, $1 \le i \le n$. Therefore $\{e_i\}_{i=1}^n$ is linearly dependent,
a contradiction. Hence $v = \sum_i \beta_i e_i$, with $\beta_i = -\alpha_0^{-1} \alpha_i$, and we see that
$\{e_1, \dots, e_n\}$ is a generating set for V. This proves (a).

It is a minimal generating set. For, if

$$\{e_1, \dots, \hat{e}_i, \dots, e_n\} := \{e_1, \dots, e_n\} \setminus \{e_i\}$$

generates V, then $e_i$ is a linear combination of the $e_j$'s, $1 \le j \ne i \le n$. But
then $\{e_i\}_{i=1}^n$ is linearly dependent, a contradiction.

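(3) $\Rightarrow$ (1) Since $\{e_i\}_{i=1}^n$ is a generating set, any $v \in V$ is written as a
linear combination of the $e_i$'s: $v = \sum \alpha_j e_j$. We need only show that this
expression is unique. If not, let $v = \sum \beta_j e_j$ be a different expression. Then
there exists $i$ such that $\alpha_i \ne \beta_i$. Hence

$$\sum_{j \ne i} (\alpha_j - \beta_j) e_j + (\alpha_i - \beta_i) e_i = 0.$$

We conclude that $e_i = -(\alpha_i - \beta_i)^{-1} \sum_{j \ne i} (\alpha_j - \beta_j) e_j$. We now make the
following:
Claim: $\{e_1, \dots, \hat{e}_i, \dots, e_n\}$ is a generating set of V.
This claim contradicts the minimality of $\{e_1, \dots, e_n\}$.

Let $w \in V$ be arbitrary. Since $\{e_i\}_{i=1}^n$ is a generating set, we can write

$$\begin{aligned} w &= \sum_k \gamma_k e_k \\ &= \sum_{k \ne i} \gamma_k e_k + \gamma_i e_i \\ &= \sum_{k \ne i} \gamma_k e_k - \gamma_i (\alpha_i - \beta_i)^{-1} \sum_{j \ne i} (\alpha_j - \beta_j) e_j \\ &= \sum_{k \ne i} \big[ \gamma_k - \gamma_i (\alpha_i - \beta_i)^{-1} (\alpha_k - \beta_k) \big] e_k \\ &= \sum_{k \ne i} \delta_k e_k. \end{aligned}$$

That is, w is in the span of $\{e_1, \dots, \hat{e}_i, \dots, e_n\}$. In other words,

$$\{e_1, \dots, \hat{e}_i, \dots, e_n\}$$

is a generating set.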




Exercise 2.3.26 Find the dimension of the set of solutions of 
(1) $x + 4z + t = 0$, $x + y + 2z - 4t = 0$.
(2) $x + 2y = 0$, $y - z = 0$, $x + y + z = 0$.


Exercise 2.3.27 In $P_n$, exhibit a basis consisting of elements each of which
has degree n.


Exercise 2.3.28 Find a basis for the vector space $M_{m \times n}(\mathbb{R})$ consisting of
the set of $m \times n$ matrices with real entries.


Exercise 2.3.29 Show that the following elementary operations on a subset
$\{v_1, \dots, v_k\}$ of a vector space V "preserve" linear independence or
dependence of the family:


(1) Interchanging two of the vectors.
(2) Multiplying a vector by a non-zero scalar.
(3) Replacing any $v_i$ by $v_i + a v_j$ for any scalar $a$ and any $j \ne i$.


What is expected of you is this: If we have $\{v_1, \dots, v_i, \dots, v_j, \dots, v_n\}$ and
use the first elementary operation to get $\{v_1, \dots, v_j, \dots, v_i, \dots, v_n\}$, then
the first set is linearly dependent (respectively independent) if and only if
the second is so. Similar remarks apply to the others.
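
Before attempting the proof, it may help to see the statement on a concrete family in $\mathbb{R}^3$. The Python sketch below is a numerical illustration, not a proof: the helper `rank` is a hypothetical function (Gaussian elimination over exact rationals), the family `vs` and the scalars 5 and 7 are arbitrary sample choices, and a family of k vectors is linearly independent exactly when its rank is k.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a family of vectors (given as rows), by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# A dependent family: the third vector is the sum of the first two.
vs = [[1, 2, 0], [0, 1, 1], [1, 3, 1]]

swapped = [vs[1], vs[0], vs[2]]                                   # operation (1)
scaled = [vs[0], [5 * x for x in vs[1]], vs[2]]                   # operation (2)
sheared = [[a + 7 * b for a, b in zip(vs[0], vs[2])], vs[1], vs[2]]  # operation (3)

ranks = [rank(f) for f in (vs, swapped, scaled, sheared)]
```

Every entry of `ranks` is 2, which is smaller than 3, so the family stays dependent after each operation.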


Theorem 2.3.8 Let V be a vector space of dimension n and let W be a
subspace of V. Any basis $\{w_1, \dots, w_k\}$ of W can be extended to a basis
$\{v_1, \dots, v_n\}$ of V such that $v_i = w_i$ for $1 \le i \le k$.


Can you think of a special case of W in Theorem 2.3.8? What does the 
theorem say when W = {0}, the trivial subspace? 


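Proof W is the subspace spanned by $\{w_1, \dots, w_k\}$. Hence $W \subseteq V$. If
$W = V$, then we are through. If $W \subsetneq V$, then there exists $v_{k+1} \in V$ such
that $v_{k+1} \notin W$, that is, $v_{k+1} \in V \setminus W$.


Claim: $\{w_1, \dots, w_k, v_{k+1}\}$ is linearly independent.


For, let $\sum_{i=1}^k \alpha_i w_i + \alpha_{k+1} v_{k+1} = 0$. Then $\alpha_{k+1} = 0$. For, if
$\alpha_{k+1} \ne 0$, then $v_{k+1} = -\alpha_{k+1}^{-1} \sum_{i=1}^k \alpha_i w_i$, and hence it is in W, a contradiction,
since by our choice $v_{k+1} \notin W$. With $\alpha_{k+1} = 0$ the relation reduces to
$\sum_{i=1}^k \alpha_i w_i = 0$, and the linear independence of the $w_i$ gives $\alpha_i = 0$ for all $i$.
Hence the claim.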

Now if $W_1 := \operatorname{Span}\{w_1, \dots, w_k, v_{k+1}\} = V$, then we are through. If
not, there exists $v_{k+2} \in V \setminus W_1$ and $\{w_1, \dots, w_k, v_{k+1}, v_{k+2}\}$ is linearly
independent as above. If $W_2 := \operatorname{Span}\{w_1, \dots, w_k, v_{k+1}, v_{k+2}\} = V$, then
we are through. Otherwise we continue the above process. Since V is finite
dimensional this process must end after a finite number of steps.

Figure 2.3.1 $\dim(W_1 + W_2) = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2)$.

In fact, the process ends exactly after $r := (\dim V - \dim W)$ steps. Do you
see why? At the end of r steps, we shall have $\{w_1, \dots, w_k, v_{k+1}, \dots, v_{k+r}\}$,
a linearly independent set of $k + r = n$ elements. By Theorem 2.3.5, this
set is a basis.

□
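
The proof of Theorem 2.3.8 is constructive, and the process it describes can be sketched in Python for $V = \mathbb{R}^n$. The names `rank` and `extend_to_basis` are hypothetical, and one choice here is an assumption not made in the proof: the new vectors are always taken from the standard basis of $\mathbb{R}^n$, whereas the proof allows any vector outside the current span.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a family of vectors (given as rows), by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def extend_to_basis(w_basis, n):
    """Extend a linearly independent family in R^n to a basis of R^n."""
    basis = [list(w) for w in w_basis]
    for i in range(n):
        e = [1 if j == i else 0 for j in range(n)]
        # e lies outside the current span exactly when adjoining it raises the rank
        if rank(basis + [e]) > len(basis):
            basis.append(e)
    return basis

# W = {(x, y, z) : y = z} with the basis found in Example 2.3.5.
extended = extend_to_basis([(1, 0, 0), (0, 1, 1)], 3)
```

Here exactly one vector is adjoined, in agreement with $r = \dim V - \dim W = 3 - 2$. Calling `extend_to_basis([], n)` corresponds to the case $W = \{0\}$ and returns the standard basis.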
Corollary 2.3.9 Let V be a finite dimensional vector space. Then V has
a basis.


Theorem 2.3.10 Let V be a vector space and $W_1$ and $W_2$ be subspaces of
V. Then

$$\dim(W_1 + W_2) = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2). \tag{2.3.11}$$


The geometric meaning underlying this theorem:


If $V = \mathbb{R}^3$, and $W_1$ and $W_2$ are, say, one-dimensional vector subspaces
of V, then they are lines through the origin.

If $W_1 \ne W_2$, then $W_1 + W_2$ will be the plane containing both the lines
and $W_1 \cap W_2 = (0)$ so that the equality

$$2 = \dim(W_1 + W_2) = (\dim W_1 = 1) + (\dim W_2 = 1) - (\dim(W_1 \cap W_2) = 0)$$

holds.
If $W_1 = W_2$, then by Exercise 2.2.21, $W_1 + W_2 = W_1$, and $W_1 \cap W_2 = W_1$,
so that

$$1 = \dim(W_1 + W_2) = (\dim W_1 = 1) + (\dim W_2 = 1) - (\dim(W_1 \cap W_2) = 1).$$

We can do a similar analysis when $W_1$ is a line and $W_2$ is a plane. Here
there are two cases: $W_1 \subset W_2$ or $W_1 \cap W_2 = \{0\}$ (see Figure 2.3.1).




In the first case, $W_1 + W_2 = W_2$ so that

$$2 = \dim(W_1 + W_2) = (\dim W_1 = 1) + (\dim W_2 = 2) - (\dim(W_1 \cap W_2) = 1).$$

In the second case, $W_1 + W_2 = \mathbb{R}^3$ so that

$$3 = \dim(W_1 + W_2) = (\dim W_1 = 1) + (\dim W_2 = 2) - (\dim(W_1 \cap W_2) = 0).$$


We leave to the reader the case when $W_1$ and $W_2$ are both planes
through the origin. Here again there are two cases: $W_1$ and $W_2$ intersect
in a line or coincide. Is it possible that $W_1 \cap W_2 = \{0\}$?
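
The two line-and-plane cases can be checked on a concrete example. The Python sketch below is an illustration under stated assumptions: the plane is taken to be the xy-plane, the helper names `rank`, `dot` and `check` are hypothetical, and the dimension of the intersection is found independently of the formula by testing whether the direction of the line is orthogonal to the normal of the plane.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a family of vectors (given as rows), by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# W2 is the xy-plane: spanned by e1 and e2, with normal vector e3.
plane_basis = [(1, 0, 0), (0, 1, 0)]
normal = (0, 0, 1)

def check(d):
    """For the line W1 = R d, return (dim(W1 + W2), dim W1 + dim W2 - dim(W1 ∩ W2))."""
    dim_int = 1 if dot(normal, d) == 0 else 0   # the line lies in the plane, or meets it only in 0
    dim_sum = rank([d] + plane_basis)           # W1 + W2 is spanned by d and the plane's basis
    return dim_sum, 1 + 2 - dim_int

first_case = check((1, 1, 0))    # line contained in the plane
second_case = check((0, 0, 1))   # line meeting the plane only at the origin
```

In both cases the two numbers returned agree, as Equation (2.3.11) predicts.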


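Proof $W_1 \cap W_2 \subset W_i$ for $i = 1, 2$. Let $\{u_i\}_{i=1}^k$ be a basis of $W_1 \cap W_2$.
Extend this to a basis

$$\{u_1, \dots, u_k, v_1, \dots, v_m\} \text{ of } W_1$$

and a basis

$$\{u_1, \dots, u_k, w_1, \dots, w_n\} \text{ of } W_2.$$

Then

$$\dim W_1 + \dim W_2 - \dim(W_1 \cap W_2) = (m + k) + (n + k) - k = m + n + k.$$

Claim: $\{u_1, \dots, u_k, v_1, \dots, v_m, w_1, \dots, w_n\}$ is a basis of $W_1 + W_2$.

Let $x \in W_1 + W_2$. Then $x = x_1 + x_2$, where $x_1 \in W_1$ and $x_2 \in W_2$.
Now $x_1 = \sum_{i=1}^k \alpha_i u_i + \sum_{j=1}^m \beta_j v_j$. Similarly, $x_2 = \sum_{i=1}^k a_i u_i + \sum_{j=1}^n b_j w_j$.
Hence $x = \sum_{i=1}^k (\alpha_i + a_i) u_i + \sum_{j=1}^m \beta_j v_j + \sum_{j=1}^n b_j w_j$. This shows that the
set $\{u_1, \dots, u_k, v_1, \dots, v_m, w_1, \dots, w_n\}$ spans $W_1 + W_2$. We now show that
it is linearly independent.

Assume that $\sum \alpha_i u_i + \sum \beta_j v_j + \sum \gamma_r w_r = 0$. We need to show that the
$\alpha_i$, $\beta_j$, $\gamma_r$'s are all zero. Now

$$\sum_{i=1}^k \alpha_i u_i + \sum_{j=1}^m \beta_j v_j = -\sum_{r=1}^n \gamma_r w_r.$$

The expression on the right side is an element of $W_2$ and that on the left
lies in $W_1$. Thus $-\sum_{r=1}^n \gamma_r w_r \in W_1 \cap W_2$ and hence we can write

$$-\sum_{r=1}^n \gamma_r w_r = \sum_{j=1}^k a_j u_j$$

so that

$$\sum_{j=1}^k a_j u_j + \sum_{r=1}^n \gamma_r w_r = 0.$$

But $\{u_1, \dots, u_k, w_1, \dots, w_n\}$ is a basis of $W_2$ and hence is a linearly independent
set. We therefore conclude that $a_j = 0$ and $\gamma_r = 0$ for all $j$ and
$r$. In particular,

$$\sum_{i=1}^k \alpha_i u_i + \sum_{j=1}^m \beta_j v_j = -\sum_{r=1}^n \gamma_r w_r = 0.$$

Since $\{u_1, \dots, u_k, v_1, \dots, v_m\}$ is a basis of $W_1$, it is linearly independent.
We therefore conclude that $\alpha_i = 0$ and $\beta_j = 0$. Thus $\alpha_i$, $\beta_j$ and $\gamma_r = 0$ for all
$i, j, r$. This completes the proof.

□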


The rest of the section may be omitted in the first reading.


Definition 2.3.8 Let $W_1, \dots, W_k$ be subspaces of a vector space V. Let
$W = W_1 + \cdots + W_k$. Then we say that W is the direct sum of the $W_i$ if
for each $j$, $1 \le j \le k$, $W_j \cap (W_1 + \cdots + W_{j-1} + W_{j+1} + \cdots + W_k) = \{0\}$.
We write $W = W_1 \oplus \cdots \oplus W_k$.

Remark 2.3.6 As we saw above when $k = 2$, direct sum implies that
$W_1 \cap W_2 = \{0\}$. However, when $k > 2$, to say $W = W_1 \oplus \cdots \oplus W_k$ implies
much more than $W_1 \cap \cdots \cap W_k = \{0\}$. The intersection of each $W_j$ with
the sum of the other $W_i$'s has to be $\{0\}$.


Exercise 2.3.30 Show that if $W_i$, $i = 1, 2$, are subspaces of V with

$$\dim W_1 + \dim W_2 > \dim V,$$

then $W_1 \cap W_2 \ne \{0\}$. What can you say if we always have
$\dim W_1 + \dim W_2 = \dim V$?


Exercise 2.3.31 Show that $M(n, \mathbb{R}) = S_n \oplus A_n$. (Notation as in
Exercise 2.1.13 and Exercise 2.1.14.) Hint: Given $X \in M(n, \mathbb{R})$, what can you
say about $(X + X^t)/2$ and $(X - X^t)/2$? Here $X^t$ stands for the transpose of X,
whose $(i, j)$th element is the $(j, i)$th element of X.


Exercise 2.3.32 Let S be the vector space of sequences. Let W be the
vector subspace of constant sequences and N be the vector subspace of null
sequences. Show that $S = W \oplus N$.


Exercise 2.3.33 Let $V := F(X, \mathbb{R})$ be the vector space of real valued
functions on X. Fix $x_0 \in X$. Let W be the vector subspace of functions
vanishing at $x_0$: $W := \{f \in F(X, \mathbb{R}) \mid f(x_0) = 0\}$. Is there a vector
subspace $W'$ such that $V = W \oplus W'$?




3. Lines and Quotient Spaces


This chapter has three sections. The first section is very important. The 
second and third sections may be learnt during a second reading of the 
book. 


3.1 Definition of a Line 


We have already seen that the one-dimensional vector subspaces in $\mathbb{R}^2$ and
$\mathbb{R}^3$ are lines passing through the origin. Recall that any one-dimensional
vector subspace is of the form $\mathbb{R}v$, for some nonzero v, and conversely.
Therefore, if a line $\ell$ in $\mathbb{R}^2$ is given, it must be parallel to one of the lines
passing through the origin, that is, to $\mathbb{R}d$ for some nonzero $d \in \mathbb{R}^2$. From
our study in Chapter 1, to get a line $\ell \subset \mathbb{R}^2$ (or $\mathbb{R}^3$), we know that we must
consider a set of the form $p + \mathbb{R}d$ where p is any point on the given line $\ell$.
This motivates our definition of a line in any arbitrary vector space V.


Definition 3.1.1 Let V be a vector space and $0 \ne d \in V$. A line passing
through $p \in V$ and having direction d is denoted by $\ell(p; d)$ and defined as

$$\ell(p; d) = \{v \in V \mid \text{there exists } t \in \mathbb{R},\ v = p + td\} = p + \mathbb{R}d.$$

In $\mathbb{R}^n$, if $x = (x_1, \dots, x_n) \in \ell(p; d)$, then there exists $t \in \mathbb{R}$ such that

$$(x_1, \dots, x_n) = (p_1, \dots, p_n) + t(d_1, \dots, d_n) = (p_1 + t d_1, \dots, p_n + t d_n).$$

Hence $x_i = p_i + t d_i$ so that $t = \frac{x_i - p_i}{d_i}$ for $1 \le i \le n$ (when every $d_i \ne 0$). Eliminating t, we get

$$\frac{x_1 - p_1}{d_1} = \frac{x_2 - p_2}{d_2} = \cdots = \frac{x_n - p_n}{d_n}.$$

(This procedure is called the elimination of t.) When $n = 3$, $d_1, d_2, d_3$ are
known as direction cosines of the line. This is why d is called the direction
vector for the line $\ell(p; d)$. When $n = 2$, the above equations are given by

$$\frac{x - p_1}{d_1} = \frac{y - p_2}{d_2}.$$

Hence $d_2(x - p_1) = d_1(y - p_2)$ or

$$y = \frac{d_2}{d_1} x + \Big( p_2 - \frac{d_2}{d_1} p_1 \Big) = mx + c,$$

where $m = d_2/d_1$ is the slope.
We encourage the reader to contemplate the geometric meaning of
the propositions in this section before starting on their proofs.


Proposition 3.1.1 $\ell(p; d) = \ell(q; d)$ if and only if $(q - p)$ is a multiple of
d.


Proof Suppose $\ell(p; d) = \ell(q; d)$. Then $x \in \ell(p; d)$ implies $x \in \ell(q; d)$.
Then there exist $s, t \in \mathbb{R}$ such that $x = p + sd = q + td$. Hence $q - p = (s - t)d$,
that is, $q - p$ is a multiple of d.

Conversely, suppose $q - p$ is a multiple of d, that is, $q - p = ad$ for some
$a \in \mathbb{R}$. Let $v \in \ell(p; d)$. Then $v = p + td = (q - ad) + td = q + (t - a)d$
for some $t \in \mathbb{R}$. Therefore, $v \in \ell(q; d)$ and so $\ell(p; d) \subset \ell(q; d)$. Similarly
$\ell(q; d) \subset \ell(p; d)$ and hence $\ell(p; d) = \ell(q; d)$.

□
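
Proposition 3.1.1 gives a practical test for deciding whether two descriptions with the same direction define the same line. The Python sketch below illustrates it in $\mathbb{R}^3$ with integer coordinates; the function names `is_multiple` and `same_line` and the sample points are hypothetical choices made for this illustration.

```python
from fractions import Fraction

def is_multiple(u, d):
    """Return True if u = a*d for some scalar a, where d is a nonzero vector."""
    i = next(k for k, dk in enumerate(d) if dk != 0)  # d is nonzero by assumption
    a = Fraction(u[i], d[i])                          # the only possible scalar
    return all(uk == a * dk for uk, dk in zip(u, d))

def same_line(p, q, d):
    """Test l(p; d) == l(q; d) via Proposition 3.1.1: q - p must be a multiple of d."""
    return is_multiple([qk - pk for pk, qk in zip(p, q)], d)

p = (1, 2, 3)
d = (2, 0, -1)
q_on = (7, 2, 0)    # equals p + 3d, so it lies on l(p; d)
q_off = (1, 3, 3)   # q_off - p = (0, 1, 0) is not a multiple of d
```

With these values `same_line(p, q_on, d)` holds while `same_line(p, q_off, d)` does not.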
The proofs of the next two propositions are left as exercises.

Proposition 3.1.2 $\ell(p; d) = \ell(p; ad)$ for any $a \in \mathbb{R} \setminus \{0\}$.
□

Proposition 3.1.3 $\ell(p; d) = \ell(q; d)$ for any $q \in \ell(p; d)$.
□


Proposition 3.1.4 Any two distinct points determine a unique line.


Proof Let $\ell$ be any line such that $p, q \in \ell$. If d is the direction of $\ell$,
then $\ell = \ell(p; d) = \ell(q; d)$ by Proposition 3.1.3 above. By Proposition 3.1.1,
$p - q = td$ for some $t \in \mathbb{R}$, $t \ne 0$ since $p \ne q$. Therefore,

$$\ell = \ell(p; d) = \ell(q; d) = \ell(p; td) = \ell(p; p - q) = \ell(q; q - p).$$

Thus the only line having p and q on it is $\ell(p; p - q)$ (which is the same as
$\ell(q; p - q)$).

□




Notation. For $x, y \in V$, $x \ne y$, we let $\ell(x, y)$ denote the unique line joining
x and y. Note that from the proof of Proposition 3.1.4 it follows that
$\ell(x, y) = \ell(x; y - x)$. Thus $\ell(x, y) = \{x + t(y - x) \mid t \in \mathbb{R}\}$. Let $V = \mathbb{R}^3$
and let $p = (x_1, y_1, z_1)$ and $q = (x_2, y_2, z_2) \in \mathbb{R}^3$. If $r = (x, y, z) \in \ell(p, q)$, then
$r = p + t(q - p)$ for some $t \in \mathbb{R}$. Equating the components and eliminating t as
earlier, we get

$$\frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1} = \frac{z - z_1}{z_2 - z_1},$$

the standard equation of a line in $\mathbb{R}^3$ joining the points $(x_i, y_i, z_i)$ for
$i = 1, 2$.


Definition 3.1.2 We say that the two lines $\ell(p; d_1)$ and $\ell(q; d_2)$ are parallel
if $d_1 = a d_2$ for some $a \in \mathbb{R}$. ($a \ne 0$, as $d_1 \ne 0$.)


Exercise 3.1.1 In $\mathbb{R}^2$, two distinct lines $\ell$ and $\ell'$ are parallel if and only if
$\ell \cap \ell' = \emptyset$. Is an analogous statement true in any $\mathbb{R}^n$?


Proposition 3.1.5 (Euclid's parallel postulate) Given $\ell$ and $q \notin \ell$,
there exists a unique line $\ell(q; d)$ such that $\ell(q; d)$ is parallel to $\ell$.


Proof Let d be a direction of $\ell$. Then $\ell(q; d)$ is a line passing through q
and parallel to $\ell$. Suppose there exists another line $\ell(q; d_1)$ such that $\ell(q; d_1)$
is parallel to $\ell$. Then $d_1 = ad$ for some $a \in \mathbb{R}$. Hence by Proposition 3.1.2,
we find that $\ell(q; d) = \ell(q; d_1)$.

□


Thus our definition of a line in $\mathbb{R}^2$ has the following properties:
(1) Any two distinct points determine a unique line.


(2) Given a line $\ell$ and a point p not on $\ell$, there exists a unique line $\ell(p; d)$
parallel to $\ell$.


Therefore $\mathbb{R}^2$ is a model of plane geometry in which the Euclidean parallel
postulate is a theorem and not a postulate!


Exercise 3.1.2 Can you define parallel planes in $\mathbb{R}^3$ using the concepts
defined so far? Can you generalize your definitions?




3.2 Affine Spaces 


Let V be an arbitrary vector space. 


Definition 3.2.1 An affine subspace is a non-empty subset S of V such that for all
$x, y \in S$, the line joining x and y also lies in S. This means that if $x, y \in S$,
then $tx + (1 - t)y \in S$ for all $t \in \mathbb{R}$. Note that S need not be a vector
subspace. Note also that $t \in \mathbb{R}$ is arbitrary.


Example 3.2.1 Let W be a vector subspace of V. Let $v \in V$ be fixed.
Then the set $S := v + W$ is an affine space. For, let $x, y \in S$ be arbitrary.
Then $x = v + w_1$ and $y = v + w_2$ for some $w_i \in W$. The line joining x and
y is the set $\ell(x, y) := \{tx + (1 - t)y \mid t \in \mathbb{R}\}$. Let $z \in \ell(x, y)$. We need
to show that $z \in S$. For this, we must show that z is of the form $v + w$ for
some $w \in W$. Since $z \in \ell(x, y)$, $z = tx + (1 - t)y$ for some $t \in \mathbb{R}$. We have

$$z = t(v + w_1) + (1 - t)(v + w_2) = (t + 1 - t)v + t w_1 + (1 - t) w_2 = v + w$$

where $w := t w_1 + (1 - t) w_2$. Since W is a vector subspace we know that
$w \in W$. Thus z is of the form $v + w$ for some $w \in W$ and hence $z \in S$.
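
The computation in Example 3.2.1 can be tried on numbers. In the Python sketch below every concrete value is an assumption made for illustration: W is taken to be the subspace $\{(x, y, z) \mid y = z\}$ of $\mathbb{R}^3$ from Example 2.3.5, v is a sample point, and the names `in_S` and `point_on_line` are hypothetical.

```python
from fractions import Fraction

v = (1, 2, 5)  # the fixed vector; S = v + W with W = {(x, y, z) : y = z}

def in_S(u):
    """u lies in v + W exactly when u - v lies in W, that is, its last two entries agree."""
    diff = [a - b for a, b in zip(u, v)]
    return diff[1] == diff[2]

x = (5, 3, 6)    # v + (4, 1, 1)
y = (-1, 5, 8)   # v + (-2, 3, 3)

def point_on_line(t):
    """The point t*x + (1 - t)*y on the line joining x and y."""
    return tuple(t * a + (1 - t) * b for a, b in zip(x, y))

# t is arbitrary: values outside the interval [0, 1] are allowed as well.
ts = [Fraction(-3), Fraction(1, 2), Fraction(7, 3)]
checks = [in_S(point_on_line(t)) for t in ts]
```

Every entry of `checks` is true, while a point such as the origin, which is not in v + W, fails the test `in_S`.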


The converse is also true and that is the content of the next theorem.


Theorem 3.2.1 A non-empty subset S of V is an affine space if and only
if it is of the form $v + W$ for some $v \in V$ and a vector subspace W of V.


Proof One way is Example 3.2.1. To prove the other way, we work backwards.
If the result is true and $S = v + W$, note that $v \in S$ as we can
write $v = v + 0 \in v + W$. Then, $S = v + W$ implies that $W = S - v$. This
suggests the following approach.

Fix $v \in S$. Consider $W := S - v$. We wish to show that W is a vector
subspace. Clearly, $0 \in W$, as $0 = v - v \in S - v$. If $w_i \in W$, we have to
show that $w_1 + w_2 \in W$. We can write $w_1 = x - v$ and $w_2 = y - v$ for some
$x, y \in S$. Now, $w_1 + w_2 = (x - v) + (y - v) = x + y + (-1)v - v$. This will
be in W if we can show that $x + y - v \in S$ for $x, y, v \in S$. (Note that the
coefficients add up to 1. This follows from the lemma below.) Assuming
this for a moment, we have shown that $w_1 + w_2 = z - v$ for some $z \in S$,
that is, it lies in W.

Next we show that if $w \in W$ and $a \in \mathbb{R}$, then $aw \in W$. Let $w = x - v$ for
$x \in S$. Then

$$aw = ax - av = ax - (a - 1)v - v = ax + (1 - a)v - v.$$

Since S is affine and $x, v \in S$ we see that $ax + (1 - a)v \in S$. Thus the
displayed equation shows that $aw = z - v$ where

$$z = ax + (1 - a)v \in S.$$

Hence $aw \in W$. We have therefore shown that W is a vector subspace.




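Lemma 3.2.2 S is an affine subspace if and only if $\sum_{i=1}^n \alpha_i v_i \in S$ for all
$v_i \in S$ and $\alpha_i \in \mathbb{R}$ with $\sum_{i=1}^n \alpha_i = 1$.


Proof Note that for $n = 2$, this is just the definition. We proceed by
induction. Assume the result for n. Let $v_i \in S$, $\alpha_i \in \mathbb{R}$ for $1 \le i \le n + 1$
be arbitrary with $\sum_i \alpha_i = 1$. There exists at least one $i$ such that $\alpha_i \ne 1$.
Without loss of generality, we assume that $\alpha_{n+1} \ne 1$. Let us denote $\alpha_{n+1}$
by $\beta$. Now, $\sum_{i=1}^n \alpha_i = 1 - \beta \ne 0$. Hence if we let $\beta_i := \frac{\alpha_i}{1 - \beta}$, then
$\sum_{i=1}^n \beta_i = 1$. By induction hypothesis, $\sum_{i=1}^n \beta_i v_i \in S$. Call it w. Hence
$(1 - \beta) w + \beta v_{n+1} \in S$. But this is nothing other than $\sum_{i=1}^{n+1} \alpha_i v_i$.

□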


Now if you go back to Chapter 1 on systems of linear equations, you will 
realize the following: 

Let S be the set of solutions of a (possibly) non-homogeneous system
(1.2.1). Assume $S \ne \emptyset$. Then S is an affine space and $S = a + S_h$ for
some (and hence any) $a$ in S. Here $S_h$ stands for the set of solutions of the
associated homogeneous system.


Definition 3.2.2 The dimension of an affine space S is $\dim W$ if $S = v + W$
for some $v \in S$.
An affine space of dimension $\dim V - 1$ is called a hyperplane.


Exercise 3.2.1 A hyperplane in $\mathbb{R}^2$ is a line. A hyperplane in $\mathbb{R}^3$ is a
plane.


3.3 Quotient Space


Definition 3.3.1 Let W be a vector subspace of a vector space V over $\mathbb{R}$.
By a coset of W in V we mean a set of the form $v + W := \{v + w \mid w \in W\}$
for some $v \in V$.


We look at a couple of examples to get a feeling of this concept.

Example 3.3.1 Let $V = \mathbb{R}^2$ and $W = \mathbb{R}e_1 = \{(x, 0) \mid x \in \mathbb{R}\}$. Take any
$v = (a, b) \in \mathbb{R}^2$. Then we have

$$\begin{aligned} v + W &= \{(a, b) + (x, 0) \mid x \in \mathbb{R}\} \\ &= \{(a + x, b) \mid x \in \mathbb{R}\} \\ &= \{(x', b) \mid x' \in \mathbb{R}\}. \end{aligned}$$

That is, $v + W$ is the line $\ell$ through $(a, b)$ parallel to the x-axis (see
Figure 3.3.1). In the notation of the last section $\ell = \ell((a, b); e_1)$. Note
that if $(a', b') \in \ell$, then $(a', b') + W = (a, b) + W$. (Check!)


Figure 3.3.1 Coset of W.


Example 3.3.2 We shall be brief here. Let $V = \mathbb{R}^3$ and

$$W = \{(x, y, 0) \mid x, y \in \mathbb{R}\}$$

be the xy-plane. Then the coset $v + W$ for any $v \in \mathbb{R}^3$ is the plane parallel
to the xy-plane through the point $v = (a, b, c)$ at "height" c.

Just to make sure that you understand these two examples, work out
the following exercise:

Exercise 3.3.1 Let $W := \{(x, y) \in \mathbb{R}^2 \mid ax + by = 0\}$ for a fixed
$(a, b) \in \mathbb{R}^2 \setminus \{(0, 0)\}$. Show that W is a one-dimensional subspace of $V = \mathbb{R}^2$
and that the cosets of W in V are the lines parallel to the line $ax + by = 0$
and hence are given by $ax + by = c$ for $c \in \mathbb{R}$.


We now want to talk of the set of cosets of W in V. In Example 3.3.1,
the cosets are the lines parallel to the x-axis. We denote the set of
cosets of W in V by $V/W$ (read as V mod W).

We want to define "addition" of two elements of $V/W$. Let us look
at Example 3.3.1 to see how to go about doing this. Any "point" $\xi$
(pronounced 'xi') of $V/W$ is a line given as, say, $y = a$. Thus we may
define the "addition" of two such "points" given by $y = a$ and $y = b$ as the
"point" given by $y = a + b$ (see Figure 3.3.2).

Note that the point $\xi_1$ (respectively $\xi_2$) of $V/W$ given by $y = a$ (respectively
by $y = b$) is the coset $(0, a) + W$ (respectively $(0, b) + W$). So what
we have done is to define $\xi_1 + \xi_2 = \xi$ where $\xi$ is the coset $(0, a + b) + W$.
That is,

$$((0, a) + W) + ((0, b) + W) := (0, a + b) + W.$$

We can also multiply the points of $V/W$ by scalars (that is, real numbers)
as follows:

$$\alpha((0, a) + W) := (0, \alpha a) + W.$$


Figure 3.3.2 Addition of cosets.


We invite the reader to show that V/W with respect to these two operations 
becomes a vector space over R. What is its dimension? 

After this special but illuminating case we wish to do a similar thing 
for V/W for any vector space V and any vector subspace W of V. Before 
doing this we observe a crucial fact about the cosets. 


Lemma 3.3.1 Let W be a vector subspace of a vector space V over $\mathbb{R}$. Let
$v_i + W$ be cosets for $i = 1, 2$. Then exactly one of the following is true:


(1) $(v_1 + W) \cap (v_2 + W) = \emptyset$.
(2) $v_1 + W = v_2 + W$.
Moreover, $v_1 + W = v_2 + W$ if and only if $v_1 - v_2 \in W$.


Proof This is geometrically clear (see Figure 3.3.3). For $v_1 + W$ and
$v_2 + W$ are "parallel subspaces" and hence either they coincide or they are
disjoint.

We prove this algebraically. Let $v_1 + W$ and $v_2 + W$ have non-empty
intersection. Then there exists $v \in V$ such that $v \in v_i + W$ for $i = 1, 2$. Since $v \in v_i + W$
there exists $w_i \in W$ such that $v = v_i + w_i$. But then $v_1 - v_2 = w_2 - w_1$.
Since W is a vector space, $w_2 - w_1 \in W$. Therefore $v_1 - v_2 \in W$. (This
proves the last statement of the lemma.) Let $u = v_1 - v_2 \in W$.

Let $x \in v_1 + W$. Then $x = v_1 + w$ for some $w \in W$. Since

$$v_1 = (v_1 - v_2) + v_2$$

and $u = v_1 - v_2 \in W$ we see that $x = u + v_2 + w$ or $x = v_2 + w'$ where
$w' = u + w \in W$. Thus $x \in v_2 + W$. Since x was an arbitrary element of
$v_1 + W$, we have thus proved that $v_1 + W \subset v_2 + W$. Interchanging $v_1$ and
$v_2$ we see that $v_2 + W \subset v_1 + W$. Thus we have proved that if the cosets
$v_1 + W$ and $v_2 + W$ have non-empty intersection, then they are the same.

□

Do you see that the last statement of the lemma is the algebraic analogue
of Proposition 3.1.1? Think about this before going further. You may
also want to go back and compare the proof of this assertion with that of
Proposition 3.1.1.

Figure 3.3.3 Illustration for Lemma 3.3.1.


Definition 3.3.2 Let $\xi \in V/W$ be a coset. If $\xi = v + W$, then v is called
a representative of $\xi$.


The representative is by no means unique. For instance, in Example 3.3.1,
for the coset $y = a$, any point on the line (that is, any point of the
form $(x, a)$) is a representative. What we saw in the proof of Lemma 3.3.1
is that if $\xi = v_1 + W$ as well as $\xi = v + W$, then $v - v_1$ is an element of W.
Thus, any two representatives of a coset of W differ by an element
of W. In particular, $x + W = W$ if and only if $x \in W$. (But, this is a
solution of Exercise 2.2.21!)

We now show how to define "addition" of two cosets $\xi_i := v_i + W$ in
$V/W$. Let $v_i$ be a representative of $\xi_i$ for $i = 1, 2$. Then we define $\xi_1 + \xi_2$ to
be the coset whose representative is $v_1 + v_2$. That is, $\xi_1 + \xi_2 = (v_1 + v_2) + W$.

We need to show that this coset $\xi_1 + \xi_2$ is defined without any ambiguity.
This is usually called "the well-definedness" of the concept. Can you
identify the possible source of confusion in the definition of $\xi_1 + \xi_2$? For,
I may choose a representative $v_i$ for $\xi_i$, for $i = 1, 2$. You may choose $u_i$
to be a representative of $\xi_i$. According to me the sum $\xi_1 + \xi_2$ is the coset
$(v_1 + v_2) + W$ whereas according to you the sum is $(u_1 + u_2) + W$.
Which of us is right? Well, we both are! That is, I claim that

$$(v_1 + v_2) + W = (u_1 + u_2) + W.$$

The equality holds if and only if $(v_1 + v_2) - (u_1 + u_2) \in W$. That is, if
and only if $(v_1 - u_1) + (v_2 - u_2) \in W$. But this is true, since $v_i$ and $u_i$
are representatives of the same coset and hence $v_i - u_i \in W$. Since W is
a vector subspace, $(v_1 - u_1) + (v_2 - u_2) \in W$. Hence the claim is proved.
Thus, to define the sum of two cosets we may use any two representatives.
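
The well-definedness argument can be watched at work in the setting of Example 3.3.1, where $W = \mathbb{R}e_1 \subset \mathbb{R}^2$. The Python sketch below is only an illustration; the names `same_coset` and `add` and the sample representatives are hypothetical. It uses the criterion of Lemma 3.3.1: two vectors represent the same coset exactly when their difference lies in W, which here means that their second coordinates agree.

```python
def same_coset(u, v):
    """u + W == v + W for W = R e1, that is, u - v has second coordinate 0."""
    return u[1] - v[1] == 0

def add(u, v):
    """Ordinary vector addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])

# Two different representatives for each of two cosets.
v1, u1 = (3, 2), (-7, 2)    # both represent the coset "y = 2"
v2, u2 = (0, 5), (11, 5)    # both represent the coset "y = 5"

my_sum = add(v1, v2)     # computed from one choice of representatives
your_sum = add(u1, u2)   # computed from the other choice
```

The vectors `my_sum` and `your_sum` differ, yet they represent the same coset, the line y = 7.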

Now I am sure you know how to define scalar multiplication on $V/W$.
Did you get the following definition? If $\alpha \in \mathbb{R}$ and $\xi = v + W \in V/W$,
then $\alpha(\xi) := \alpha v + W$. As earlier, we may show that this is well-defined.
If $\xi = u + W$, then $\alpha v + W = \alpha u + W$ as $\alpha v - \alpha u = \alpha(v - u) \in W$ since
$v - u \in W$. (Why is $v - u \in W$?)

Thus we have defined addition and scalar multiplication on the set V/W. 
We claim that V/W with these two operations is a vector space over R. We 
state this as a theorem. 


Theorem 3.3.2 Let W be a vector subspace of V. Let $V/W$ denote the set
of cosets of V with respect to W. The following operations are well-defined:


(1) $\xi_1 + \xi_2 = (v_1 + v_2) + W$ where $\xi_i := v_i + W \in V/W$.
(2) $\alpha \xi = (\alpha v) + W$ where $\xi = v + W \in V/W$.
$V/W$ with these operations becomes a vector space.


Proof That the operations are well-defined is proved above. Suppose we
want to prove the associativity of the addition on $V/W$. Let

$$\xi_i = v_i + W, \quad 1 \le i \le 3,$$

be arbitrary. Then we need to show that

$$(\xi_1 + \xi_2) + \xi_3 = \xi_1 + (\xi_2 + \xi_3).$$

That is, to show that

$$((v_1 + v_2) + W) + (v_3 + W) = (v_1 + W) + ((v_2 + v_3) + W).$$

The left side is $[(v_1 + v_2) + v_3] + W$, while the right side is $[v_1 + (v_2 + v_3)] + W$.
But by the associativity of addition in $V$, we know that

$$(v_1 + v_2) + v_3 = v_1 + (v_2 + v_3)$$

so that both the cosets $[(v_1 + v_2) + v_3] + W$ and $[v_1 + (v_2 + v_3)] + W$ are
equal to $(v_1 + v_2 + v_3) + W$.


What is the zero element of $V/W$?
The rest of the proof goes along the same lines. The burden of showing
something in $V/W$ relies on the observation that its analogue is true in $V$. □

$V/W$ is called the quotient space of $V$ with respect to $W$. What is its
dimension?


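Theorem 3.3.3 Let $V$ be a finite dimensional vector space and $W$ a vector
subspace of $V$. Then $\dim V/W = \dim V - \dim W$.

Proof Let $\{w_1, \ldots, w_m\}$ be a basis of $W$. Extend this to a basis
$\{w_1, \ldots, w_m, v_1, \ldots, v_n\}$
of $V$. We show that $\{v_j + W\}_{j=1}^{n}$ is a basis of $V/W$. Let $x + W \in V/W$.
Since $x \in V$, $x = \sum_{i=1}^{m} \alpha_i w_i + \sum_{j=1}^{n} \beta_j v_j$ for some $\alpha_i$'s and $\beta_j$'s. Hence

$$
\begin{aligned}
x + W &= \sum_{i=1}^{m} \alpha_i w_i + \sum_{j=1}^{n} \beta_j v_j + W \\
      &= \sum_{j=1}^{n} \beta_j v_j + W \qquad \text{by Exercise 2.2.21} \\
      &= \sum_{j=1}^{n} \beta_j (v_j + W).
\end{aligned}
$$

Therefore every element of $V/W$ can be expressed as a linear combination
of $\{v_1 + W, \ldots, v_n + W\}$. We prove that $\{v_1 + W, \ldots, v_n + W\}$ is linearly
independent. Suppose $a_1(v_1 + W) + \cdots + a_n(v_n + W)$ is the zero element
of $V/W$ for some $a_i \in \mathbb{R}$. Then

$$\sum_{i=1}^{n} a_i (v_i + W) = \Big(\sum_{i=1}^{n} a_i v_i\Big) + W = W.$$

It follows that $\sum_{i=1}^{n} a_i v_i \in W$. Since $\{w_j\}_{j=1}^{m}$ is a basis of $W$, we can
write $\sum_{i=1}^{n} a_i v_i = \sum_{j=1}^{m} \beta_j w_j$ and thus $\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} \beta_j w_j = 0$. Since
$\{w_1, \ldots, w_m, v_1, \ldots, v_n\}$ is a basis of $V$, we deduce that $a_i = 0$ for all $i$ and
$\beta_j = 0$ for all $j$. Thus $\sum_{i} a_i (v_i + W) = 0$ in $V/W$ implies that $a_i = 0$ for
all $i$. Hence $\{v_j + W\}_{j=1}^{n}$ is linearly independent and we have $\dim V/W = n$
and $\dim V - \dim W = m + n - m = n$. □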


4. Linear Transformations


If X and Y are any two arbitrary sets, there is no obvious restriction on the 
kind of maps between X and Y, except that it is one-one or onto. However 
if X and Y have some additional structure, we wish to consider those maps 
which in some sense “preserve” the extra structure on the sets X and Y. 


4.1 Linear Transformation 


Informally, a “linear transformation” preserves algebraic operations. The 
sum of two vectors is mapped to the sum of their images and the scalar 
multiple of a vector is mapped to the same scalar multiple of its image. 
More precisely, we have the following definition: 


Definition 4.1.1 A linear transformation (or a linear map) T from a 
vector space V to a vector space W is a map which satisfies: 


(i) $T(0) = 0$.
(ii) $T(v_1 + v_2) = T(v_1) + T(v_2)$, for $v_1, v_2 \in V$.
(iii) $T(\alpha v) = \alpha T(v)$, for $\alpha \in \mathbb{R}$ and $v \in V$.
Remark 4.1.1 Condition (i) is not needed (see Proposition 4.1.1). 
Remark 4.1.2 This remark may be omitted in the first reading. 


Geometrically, a linear map sends lines passing through the origin to 
lines passing through the origin or onto the origin. 


Let $z \in \ell(x, y)$, $x, y \in V$. Then we can write

$$z = x + t(y - x) = (1 - t)x + ty$$

for a unique $t \in \mathbb{R}$. We say that $z$ divides the line segment

$$[x, y] = \{x + s(y - x) \mid 0 \le s \le 1\}$$

in the ratio $(1 - t) : t$. Note that the sense of direction is important here.
For, if we write $z = y + s(x - y)$ then $s = 1 - t$ so that $z$ divides the line
segment $[y, x]$ in the ratio $(1 - s) : s$, that is, in the ratio $t : (1 - t)$. Now, going back
to our earlier notation, if $v$ lies on the line joining $v_1$ and $v_2$ and
divides it in the ratio $t : (1 - t)$, then $T(v)$ also lies on the line joining $T(v_1)$
and $T(v_2)$ and divides it in the same ratio. This statement is not quite
true, since $T(v_1)$ and $T(v_2)$ could be equal so that there is no unique line
joining them. However, with a proper convention (in such a case the line
is to be taken as the point $T(v_1)$) the observation remains correct.


Let us look at some examples of linear transformations. The verifications
are left to the reader.


Example 4.1.1 The map $T: V \to W$ defined by $Tv = 0$ for all $v \in V$ is
a linear transformation.


Example 4.1.2 The simplest kind of linear transformation $T: V \to V$ is
the identity map $I$ where $I(v) = v$ for $v \in V$. More generally, if $\alpha \in \mathbb{R}$
then $T = \alpha I$ defined by $\alpha I(v) = \alpha v$ is linear.


Example 4.1.3 Let $V = \mathbb{R}^m$ and $W = \mathbb{R}^n$ with $m < n$. Consider the map
$T: V \to W$ given by

$$T(x_1, \ldots, x_m) = (x_1, \ldots, x_m, 0, \ldots, 0), \quad (n - m \text{ zeroes}).$$

Then $T$ is a one-one linear map called the natural inclusion of $\mathbb{R}^m$ into $\mathbb{R}^n$.


Example 4.1.4 Let $V = \mathbb{R}^m$ and $W = \mathbb{R}^n$ but now assume that $m > n$.
Consider $T: V \to W$ defined by $T(x_1, \ldots, x_m) = (x_1, \ldots, x_n)$. That is,
we drop the last $m - n$ coordinates of the vector from $\mathbb{R}^m$. ($T$ is called the
natural projection of $\mathbb{R}^m$ onto $\mathbb{R}^n$.)


Example 4.1.5 Let $V = \mathbb{R}^n = W$. Fix scalars $\alpha_i \in \mathbb{R}$, $1 \le i \le n$. Let
$Tx := (\alpha_1 x_1, \ldots, \alpha_n x_n)$. Then $T$ is a linear transformation.


Example 4.1.6 Let $V = \mathbb{R}^2 = W$. Consider $R_\theta: V \to W$ given by

$$R_\theta \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta\, x + \sin\theta\, y \\ -\sin\theta\, x + \cos\theta\, y \end{pmatrix}.$$

Then $R_\theta$ is a linear map. It is the rotation by an angle $\theta$ from the positive
$x$-axis. See Example 6 in Section 4.6.


Example 4.1.7 Consider $\mathbb{C}$ as a vector space over $\mathbb{R}$ (as in Exercise 2.1.21).
Consider the map $T: \mathbb{C} \to \mathbb{C}$ given by $z \mapsto iz$, the multiplication by $i$. $T$ is
a linear map.


Exercise 4.1.1 The conjugation map from $\mathbb{C}$ to itself given by $z \mapsto \bar{z}$ is
linear.


Exercise 4.1.2 Fix $v \in V$. Consider the map $T_v: V \to V$ given by
$T_v x = x + v$.
Then $T_v$ is a linear map if and only if $v = 0$. ($T_v$ is called the translation
of $V$ by $v$.)

Exercise 4.1.3 Let $T: \mathbb{R} \to \mathbb{R}$ be given by $T(x) = x^2$. Then $T$ is not
linear.

Exercise 4.1.4 Let $T: \mathbb{R}^2 \to \mathbb{R}^3$ be given by $T(x, y) = (x, y, xy)$. Then $T$
is not linear.

Exercise 4.1.5 Let $T: \mathbb{R}^2 \to \mathbb{R}^2$ be given by $T(x, y) = (x, y + 3)$. Is $T$
linear?
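
A failure of linearity, as claimed in Exercises 4.1.3 and 4.1.4, can be exhibited by a single pair of vectors or a single scalar. The sketch below is only illustrative: the helper names `is_additive` and `is_homogeneous` are ours, vectors are represented as tuples, and the map $(x, y) \mapsto (x + y, x - y)$ is included merely as a linear map for comparison. Note that a passing check on one input proves nothing; only a failing check is conclusive.

```python
def is_additive(T, u, v):
    """Check T(u + v) == T(u) + T(v) for one pair of vectors (tuples)."""
    s = tuple(a + b for a, b in zip(u, v))
    return T(s) == tuple(a + b for a, b in zip(T(u), T(v)))

def is_homogeneous(T, a, v):
    """Check T(a v) == a T(v) for one scalar a and one vector v."""
    return T(tuple(a * x for x in v)) == tuple(a * y for y in T(v))

# Vectors are tuples, even in dimension 1.
square = lambda v: (v[0] ** 2,)                     # T(x) = x^2
graph = lambda v: (v[0], v[1], v[0] * v[1])         # T(x, y) = (x, y, xy)
sum_diff = lambda v: (v[0] + v[1], v[0] - v[1])     # T(x, y) = (x + y, x - y)

# (1 + 1)^2 = 4 while 1^2 + 1^2 = 2, so additivity fails.
print(is_additive(square, (1,), (1,)))
# T(2, 2) = (2, 2, 4) while 2 T(1, 1) = (2, 2, 2), so homogeneity fails.
print(is_homogeneous(graph, 2, (1, 1)))
# For the comparison map both checks pass on these inputs.
print(is_additive(sum_diff, (1, 2), (3, 4)) and is_homogeneous(sum_diff, 5, (1, 2)))
```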

Exercise 4.1.6 Fix $A \in M(n, \mathbb{R})$. Consider the map

$$L_A: M(n, \mathbb{R}) \to M(n, \mathbb{R})$$

defined by $L_A(X) = AX$, the product of two matrices. Then $L_A$ is a linear
map. Is the square map $X \mapsto X^2$ linear?


Exercise 4.1.7 The map T of Example 2.1.3 is a bijective linear map. 


Proposition 4.1.1 Let $T: V \to W$ be a linear map. Then the following
are true:

(1) $T(0) = 0$. That is, $T$ maps the zero element of $V$ to that of $W$.

(2) $T(-v) = -T(v)$ for $v \in V$, that is, $T$ maps the negative of an element
of $V$ to the negative of the image.

(3) $T(v_1 - v_2) = T(v_1) - T(v_2)$.


Proof Just for the purpose of this proof, we shall let $0_V$ (respectively $0_W$)
denote the zero element of $V$ (respectively $W$). Since $0_V = 0_V + 0_V$, we
have

$$T(0_V) = T(0_V + 0_V).$$

But,

$$0_V + 0_V = 1 \cdot 0_V + 1 \cdot 0_V = 2 \cdot 0_V$$

so that

$$T(0_V) = T(2 \cdot 0_V) = 2 \cdot T(0_V).$$

Hence $2 \cdot T(0_V) - T(0_V) = 0_W$ or $T(0_V) = 0_W$. This proves (1).

To prove (2), $T(-x) = T((-1) \cdot x) = (-1) \cdot T(x) = -T(x)$ where we
have used (4) of Theorem 2.1.1 twice.

You should now prove (3) just to show that you have understood the
above arguments. □


Definition 4.1.2 If $V$ and $W$ are two vector spaces, we denote the set of
all linear maps from $V$ to $W$ by $L(V, W)$.
$L(V, \mathbb{R})$ is usually denoted by $V^*$ and called the dual of $V$.
Any linear transformation $T: V \to V$ is called an endomorphism of $V$.
We let $\mathrm{End}(V) = L(V, V)$.


The following proposition states that L(V, W) is a vector space. 


Proposition 4.1.2 Let $S, T \in L(V, W)$ and $\alpha \in \mathbb{R}$. Then $S + T$ and $\alpha S$
defined by $(S + T)(x) = Sx + Tx$ and $(\alpha S)x = \alpha Sx$ are again linear maps.
With these operations $L(V, W)$ is a vector space.


Proof A routine verification and hence left to the reader. □


Definition 4.1.3 Any linear map $T: V \to \mathbb{R}$ is called a linear functional
or linear form. It is customary to denote the linear functionals by letters
$f$, $g$, etc. The set of linear forms on a given vector space is denoted by $V^*$
and is called the dual of $V$.


Example 4.1.8 Show that the f defined at the end of Section 1.1 is a 
linear functional. (What is the vector space? The underlying field?) 


Example 4.1.9 Let $f_i: \mathbb{R}^n \to \mathbb{R}$ be given by $f_i(x_1, \ldots, x_n) = x_i$, for a fixed
$i$. Then $f_i$ is a linear form. The $f_i$'s are also called coordinate functions. Can
you generalize this?


Exercise 4.1.8 Let $\{v_i\}_{i=1}^{n}$ be a basis of $V$. Define $f_i: V \to \mathbb{R}$ by $f_i(v) = a_i$
if $v = \sum_{j=1}^{n} a_j v_j$. Then $f_i$ is a linear form. That is, the coordinate
functions with respect to a basis are linear forms.


Example 4.1.10 Let us find all linear maps $f: \mathbb{R} \to \mathbb{R}$. If we fix any
nonzero vector, say, $x_0 \in \mathbb{R}$, then any $x \in \mathbb{R}$ can be written as $x = a x_0$. If
$f(x_0) = y_0$, then $f(x) = f(a x_0) = a f(x_0) = a y_0$. In particular, if $x_0 = 1$,
then $f(x) = f(1)x$. Thus all linear maps from $\mathbb{R}$ to $\mathbb{R}$ are of the form
$f(x) = \alpha x$ for a fixed $\alpha \in \mathbb{R}$.


Exercise 4.1.9 Let V be a one-dimensional vector space. Find all linear 
forms on V. 




Example 4.1.11 We extend the reasoning above to find all linear forms
$f: \mathbb{R}^n \to \mathbb{R}$. Fix the standard basis $\{e_1, \ldots, e_n\}$ of $\mathbb{R}^n$. Let $f$ be a linear
form and let $a_i := f(e_i)$. Then

$$f(x) = f\Big(\sum_{i=1}^{n} x_i e_i\Big) = \sum_{i=1}^{n} f(x_i e_i) = \sum_{i=1}^{n} x_i f(e_i) = \sum_{i=1}^{n} x_i a_i.$$

Conversely, if $(a_1, \ldots, a_n)$ is any $n$-tuple and if we define $f(x) = \sum_{i=1}^{n} x_i a_i$,
then $f$ is a linear form. Is there any special reason for us to employ the
standard basis in the above argument?
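
The two directions of Example 4.1.11 can be mirrored in a few lines. This is a minimal sketch with names of our own choosing (`make_linear_form`, `coefficients`) and an arbitrary sample tuple $(2, -1, 5)$; it builds $f$ from the $a_i$ and then recovers the $a_i$ as $f(e_i)$.

```python
def make_linear_form(a):
    """Return the form f with f(x) = sum_i x_i * a_i for the fixed tuple a."""
    def f(x):
        return sum(xi * ai for xi, ai in zip(x, a))
    return f

def coefficients(f, n):
    """Recover a_i = f(e_i) by evaluating f on the standard basis of R^n."""
    basis = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    return tuple(f(e) for e in basis)

f = make_linear_form((2, -1, 5))   # sample coefficients, chosen arbitrarily
a = coefficients(f, 3)             # evaluates f on e_1, e_2, e_3
value = f((1, 1, 1))               # 2 - 1 + 5
print(a, value)
```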


Exercise 4.1.10 Let $f_j: \mathbb{R}^m \to \mathbb{R}$ be arbitrary functions. Let $T: \mathbb{R}^m \to \mathbb{R}^n$
be defined by $T(x_1, \ldots, x_m) = (f_1(x), \ldots, f_n(x))$. When is $T$ linear? Hint:
Review Exercise 4.1.3 through Exercise 4.1.5.


Exercise 4.1.11 Let V be an n-dimensional vector space. Find all linear 
forms on it. 


If the reader goes through the last two examples and exercises, he will 
understand the most important fact: 


Any linear transformation $T: V \to W$ is completely
determined by its action on a basis of $V$.


This is the content of the next theorem. 


Theorem 4.1.3 Let $V$ and $W$ be vector spaces. Let $\{v_1, \ldots, v_n\}$ be a basis
of $V$. Let $w_i$, $1 \le i \le n$, be any set of (not necessarily distinct) vectors in
$W$. Then there is a unique linear map $T: V \to W$ such that $T(v_i) = w_i$.


Proof Let $v \in V$. Then $v = \sum_{i=1}^{n} \alpha_i v_i$ for uniquely determined scalars $\alpha_i$.
Define $T: V \to W$ by

$$T(v) = \sum_{i=1}^{n} \alpha_i w_i.$$

We have $v_j = 0v_1 + 0v_2 + \cdots + 1v_j + 0v_{j+1} + \cdots + 0v_n$. Hence

$$T(v_j) = 0w_1 + 0w_2 + \cdots + 1w_j + 0w_{j+1} + \cdots + 0w_n.$$

Therefore $T(v_j) = w_j$.


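Claim: $T$ is a linear map.

Let $u, v \in V$. Then $v = \sum_{i=1}^{n} \alpha_i v_i$ and $u = \sum_{i=1}^{n} \beta_i v_i$. Therefore,

$$T(v) = \sum_{i=1}^{n} \alpha_i w_i, \qquad T(u) = \sum_{i=1}^{n} \beta_i w_i.$$

By Exercise 2.3.11, $v + u = \sum_{i=1}^{n} (\alpha_i + \beta_i) v_i$. Therefore

$$
\begin{aligned}
T(v + u) &= T\Big(\sum_{i=1}^{n} (\alpha_i + \beta_i) v_i\Big) \\
         &= \sum_{i=1}^{n} (\alpha_i + \beta_i) w_i \\
         &= \sum_{i=1}^{n} \alpha_i w_i + \sum_{i=1}^{n} \beta_i w_i \\
         &= T(v) + T(u).
\end{aligned}
$$

Let $a \in \mathbb{R}$, then $av = a\big(\sum_{i=1}^{n} \alpha_i v_i\big) = \sum_{i=1}^{n} (a\alpha_i) v_i$. Therefore

$$T(av) = \sum_{i=1}^{n} (a\alpha_i) w_i = a \sum_{i=1}^{n} \alpha_i w_i = aT(v).$$

Hence $T$ is a linear map.

To prove the uniqueness part of the theorem, let $T'$ be a linear map from $V$
to $W$ such that $T'(v_i) = w_i$. Then we claim $T = T'$.

Let $v \in V$. Then $v = \sum_{i=1}^{n} \alpha_i v_i$. Therefore

$$T'(v) = T'\Big(\sum_{i=1}^{n} \alpha_i v_i\Big) = \sum_{i=1}^{n} \alpha_i T'(v_i) = \sum_{i=1}^{n} \alpha_i w_i = T(v).$$

Since $v \in V$ was arbitrary, $T' = T$. Thus $T$ is unique. □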
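
The construction in the proof can be followed on a concrete case. The sketch below rests on choices of our own, not taken from the text: $V = \mathbb{R}^2$ with the basis $v_1 = (1, 1)$, $v_2 = (1, -1)$, $W = \mathbb{R}^3$, and two arbitrarily chosen images $w_1$, $w_2$. The function `coords` uses the explicit formula for coordinates in this particular basis.

```python
from fractions import Fraction

# Assumed basis of V = R^2 and chosen images in W = R^3 (illustrative values).
v1, v2 = (1, 1), (1, -1)
w1, w2 = (1, 0, 2), (0, 3, 0)

def coords(v):
    """Coordinates (a1, a2) of v in the basis {v1, v2}: v = a1*v1 + a2*v2."""
    x, y = v
    return (Fraction(x + y, 2), Fraction(x - y, 2))

def T(v):
    """The map of the proof: T(a1*v1 + a2*v2) = a1*w1 + a2*w2."""
    a1, a2 = coords(v)
    return tuple(a1 * p + a2 * q for p, q in zip(w1, w2))

# T sends each basis vector to its prescribed image.
print(T(v1) == w1, T(v2) == w2)
# (3, 1) = 2*v1 + 1*v2, so its image is 2*w1 + w2.
print(T((3, 1)) == (2, 3, 4))
```

Once the images $w_1$, $w_2$ are fixed there is no freedom left in `T`, which is the uniqueness half of the theorem.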


From Proposition 4.1.2, we know that the set $L(V, W)$ of linear maps
from $V$ to $W$ is a vector space. What is its dimension?


Theorem 4.1.4 Let $V$ and $W$ be vector spaces. Then the dimension of
$L(V, W)$ is $\dim V \times \dim W$.


Proof This is an application of Theorem 4.1.3. We shall sketch the proof,
leaving the details to the reader. Another proof can be obtained using the
results of Section 4.4 (see Exercise 4.4.8).

Fix a basis $\{v_i\}_{i=1}^{m}$ of $V$ and $\{w_j\}_{j=1}^{n}$ of $W$. Define $T_{ij} \in L(V, W)$ by
setting

$$T_{ij}(v_k) = \begin{cases} 0 & \text{if } k \ne i, \\ w_j & \text{if } k = i, \end{cases}$$

for $1 \le i \le m$ and $1 \le j \le n$. (Such a $T_{ij}$ exists and is unique by
Theorem 4.1.3.) The theorem follows from the claims:

(1) $\{T_{ij}\}$ generate $L(V, W)$.
(2) $\{T_{ij}\}$ is linearly independent.

To prove claim (1), Theorem 4.1.3 is again invoked. If $T \in L(V, W)$ and
$T(v_i) = y_i \in W$, express $y_i$ in terms of the $w_j$'s. Now you are on your own. □


Exercise 4.1.12 Let $A: V \to W$ and $B: W \to U$ be linear maps. Then
$B \circ A: V \to U$ is a linear map.


Exercise 4.1.13 (Dilations) Fix $\alpha \in \mathbb{R}$ and let $T_\alpha: V \to V$ be given
by $T_\alpha(v) = \alpha v$. If $\alpha = 0$, then $T_\alpha(v) = 0$ for all $v \in V$. If $\alpha = 1$, then
$T_\alpha(v) = v$, that is, $T_\alpha$ is the identity map.


Exercise 4.1.14 Let $\frac{d}{dX}: \mathcal{P}_n \to \mathcal{P}_n$ be defined by

$$\frac{d}{dX}\Big(\sum_{k=0}^{n} a_k X^k\Big) = \sum_{k=1}^{n} k a_k X^{k-1}.$$

Then $\frac{d}{dX}$ is a linear map of $\mathcal{P}_n$ into the vector subspace $\mathcal{P}_{n-1}$.


To solve the next three exercises, you need results from analysis.


Exercise 4.1.15 Let $S, T: C[0, 1] \to \mathbb{R}$ be defined by $S(f) := f(t_0)$, and
$T(f) := \int_0^1 f(t)\, dt$ where $t_0 \in [0, 1]$. Then $S$ and $T$ are linear.


Exercise 4.1.16 Let $C$ be the set of all convergent real sequences and let
$T: C \to \mathbb{R}$ be defined by $(x_n) \mapsto \lim x_n$. Then $T$ is a linear transformation.


Exercise 4.1.17 Let $D[0, 1]$ be the set of all continuously differentiable
functions on $[0, 1]$ and let $T: D[0, 1] \to C[0, 1]$ by $f \mapsto f'$. Then $T$ is linear.


Exercise 4.1.18 Let $V = \mathbb{R}^n$ and $W$ be the subspace given by

$$W := \{x \in \mathbb{R}^n \mid x_n = 0\}.$$

Consider $P: V \to W$ given by $(x_1, \ldots, x_n) \mapsto (x_1, \ldots, x_{n-1}, 0)$. Then $P$
is a linear map called the natural projection. What is $P^2$?


Exercise 4.1.19 Let $V := \mathbb{R}^n$ and $W := \mathbb{R}^{n+1}$. Define $\varphi: V \to W$ by
$\varphi(x_1, \ldots, x_n) = (x_1, \ldots, x_n, 0)$. Then $\varphi$ is a linear map called the natural
inclusion.

Exercise 4.1.20 Can you think of generalizations of Exercises 4.1.18 and 
4.1.19? 

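Exercise 4.1.21 Let $V = \mathcal{P}_n$ and $W = \mathbb{R}^m$. For $P(X) \in V$ and $\alpha \in \mathbb{R}$,
we let $P(\alpha)$ be the "value" of $P$ at $X = \alpha$, obtained by substituting $\alpha$ for
$X$. Let $\alpha_1, \ldots, \alpha_m$ be any real scalars. The map $T: V \to W$ given by
$TP = (P(\alpha_1), \ldots, P(\alpha_m))$ is a linear map.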


Exercise 4.1.22 Let $V$ be a vector space. Fix a basis $\{v_i\}_{i=1}^{n}$ of $V$. To
define linear forms on $V$, by Theorem 4.1.3 it is enough to define $f(v_j)$.
Define $f_i \in V^*$ by setting

$$f_i(v_j) = \begin{cases} 1, & \text{if } j = i, \\ 0, & \text{otherwise.} \end{cases}$$

Show that $\{f_i \mid 1 \le i \le n\}$ is a basis of $V^*$, called the basis of $V^*$ dual to
the given basis $\{v_i\}$ of $V$.

Exercise 4.1.23 Let $V = \mathbb{R}^n$. What is the basis of $V^*$ dual to the standard
basis of $\mathbb{R}^n$?

Exercise 4.1.24 Let $V$ be the vector space of solutions $f$ of

$$y^{(n)} + p_1(x) y^{(n-1)} + \cdots + p_n(x) y = 0.$$

(Refer to Exercise 2.1.22.) Let $W = \mathbb{R}^k$, for a fixed $k$ with $1 \le k \le n$.
Consider $T: V \to W$ given by $Tf = (f(0), f'(0), \ldots, f^{(k-1)}(0))$. Then $T$
is linear.

The following exercise is an important one. It introduces a family of 
linear transformations from R™ to R”. 
Exercise 4.1.25 Let $A := (a_{ij})$ be an $n \times m$ matrix with real entries. We
let $V := \mathbb{R}^m$ and $W := \mathbb{R}^n$. We write the elements of these vector spaces
as column vectors (rather than row vectors as we have been doing so far):

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} \in \mathbb{R}^m, \quad \text{and} \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \in \mathbb{R}^n.$$

We define $Tx := Ax$, the matrix multiplication of the $n \times m$ matrix $A$ by the
$m \times 1$ matrix $x$, thereby getting an $n \times 1$ matrix. The latter is considered
as a vector in $\mathbb{R}^n$. Show that $T$ is a linear map.


Convention. Whenever we write $Ax$ where $A$ is an $n \times m$ matrix and
$x \in \mathbb{R}^m$, we assume $x$ is written as a column vector or an $m \times 1$ matrix so
that the matrix multiplication is defined.


Exercise 4.1.26 Go through Example 4.1.11. Can you reformulate the 
result there in light of Exercise 4.1.25? 


Exercise 4.1.27 Let $M_{n \times m}(\mathbb{R})$ denote the set of all $n \times m$ real matrices.
From Exercise 4.1.25, we have a map $A \mapsto T$ (the linear map associated to
$A$) from $M_{n \times m}(\mathbb{R})$ to $L(\mathbb{R}^m, \mathbb{R}^n)$. Investigate whether this map is one-one,
onto.


Exercise 4.1.28 Let $A$ be as in Exercise 4.1.25. Let $V$ and $W$ be vector
spaces with $\dim V = m$ and $\dim W = n$. Fix a basis $\{v_i\}$ of $V$ and $\{w_j\}$ of
$W$. Can you associate a linear transformation $T$ from $V$ to $W$?


Exercise 4.1.29 Let $T: V \to W$ be linear. Let $V^*$ and $W^*$ be their duals
(see Definition 4.1.3). We define a map $T^*: W^* \to V^*$ as follows. Given
$g \in W^*$, $T^*g \in V^*$ is given by $T^*g(v) := g(Tv)$ for all $v \in V$. Show that $T^*$
is a linear map. It is called the adjoint of $T$.


4.2 Representation of Linear Maps by Matrices


Let $V$ be an $m$-dimensional vector space and $W$ be an $n$-dimensional vector
space. Let $\{v_1, \ldots, v_m\}$ and $\{w_1, \ldots, w_n\}$ be bases of $V$ and $W$ respectively.
The aim of this section is to show that there exists a bijective linear
map from $L(V, W)$ to $M_{n \times m}(\mathbb{R})$. This fact was hinted at in the last few
exercises of the previous section.


Let $T: V \to W$ be a linear map. Then $T(v_i) \in W$. Therefore,

$$T(v_i) = \sum_{j=1}^{n} a_{ij} w_j, \quad 1 \le i \le m.$$

We write this in an expanded form

$$
\begin{aligned}
Tv_1 &= a_{11} w_1 + \cdots + a_{1n} w_n \\
Tv_2 &= a_{21} w_1 + \cdots + a_{2n} w_n \\
     &\;\;\vdots \\
Tv_m &= a_{m1} w_1 + \cdots + a_{mn} w_n.
\end{aligned}
\qquad (4.2.1)
$$

We define the matrix $M_v^w(T)$ of $T$ with the choice of bases $\{v_i\}$ of $V$ and
$\{w_j\}$ of $W$ to be the transpose of the matrix of the coefficients in
Equation (4.2.1). That is,

$$M_v^w(T) = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}.$$

The matrix $M_v^w(T)$ is called the matrix associated with $T$ with respect
to the bases $\{v_1, \ldots, v_m\}$ and $\{w_1, \ldots, w_n\}$. Also, $M_v^w(T)$ is called the
matrix representation of $T$ with respect to these bases. Note that the
matrix $M_v^w(T)$ is an $n \times m$ matrix whose first column is the coefficients of
$Tv_1$ when expressed as a linear combination of the $w_j$, and so on.


The matrix $M_v^w(T)$ is the $n \times m$ matrix whose $i$th
column is the coefficients of $Tv_i$ when expressed as a linear
combination of the $w_j$, $1 \le i \le m$.


This is the secret recipe which allows you to write the matrix of a linear
transformation with respect to the given bases. Let us put it into use.


Example 4.2.1 Let $V = \mathbb{R}^2 = W$. Consider the linear map $T$ given by
$(x, y) \mapsto (x + y, x - y)$.

We use the standard basis $\{e_1 = (1, 0), e_2 = (0, 1)\}$ on both $V$ and $W$. We
want the matrix associated to $T$ with respect to these bases. We prepare
the matrix using the recipe

$$
\begin{aligned}
Te_1 &= (1, 1) = (1, 0) + (0, 1) = 1 \cdot e_1 + 1 \cdot e_2 \\
Te_2 &= (1, -1) = (1, 0) + (0, -1) = 1 \cdot e_1 - 1 \cdot e_2.
\end{aligned}
$$

The first column of $M_e^e(T)$ is therefore $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ while the second is $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
Hence

$$M_e^e(T) = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$

To consolidate this idea, do the next couple of exercises before you proceed.


Exercise 4.2.1 Let the notation be as above. Let

$$T(x, y) = (ax + by, cx + dy).$$

Find the matrix representation of $T$ with respect to the standard basis.


Exercise 4.2.2 Let $V = \mathbb{R}^2$ and $W = \mathbb{R}^4$. Let $T$ be defined by

$$T(x, y) = (x, y, x + y, x - y).$$

Find the matrix of $T$ with respect to the standard bases.


Exercise 4.2.3 Let $\{v_1, \ldots, v_n\}$ be a basis of $\mathbb{R}^n$. Find the matrix
$A \in M(n, \mathbb{R})$ which, when considered as a linear map from $\mathbb{R}^n$ to itself,
takes the standard basis vector $e_i$ to $v_i$ for $1 \le i \le n$.


Let us do one more example which will be an eye opener. 


Example 4.2.2 Let $T: \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $T(x, y) = (x + y, x - y)$.
We choose $\{v_1 = (0, 1), v_2 = (1, 0)\}$ as a basis for both the domain and
range of $T$. What is the corresponding matrix of $T$? Let us do it in a
systematic way as earlier:

$$
\begin{aligned}
Tv_1 &= (1, -1) = (0, -1) + (1, 0) = (-1) \cdot v_1 + 1 \cdot v_2 \\
Tv_2 &= (1, 1) = (1, 0) + (0, 1) = 1 \cdot v_1 + 1 \cdot v_2.
\end{aligned}
$$

Thus the required matrix is

$$\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}.$$

Compare this with that of Example 4.2.1. Do you see the difference? Even
though the basis in Example 4.2.1 is the same as that in Example 4.2.2 as sets,
the order in which they are listed seems to matter. The lesson we learn
from it is


In writing the matrix associated to a linear map with
respect to the given bases, the order in which the elements
appear in the lists matters.


Just to make sure that you appreciate this, redo Exercise 4.2.1 with the
basis $\{e_2, e_1\}$ and compare the new matrix with the earlier one (see also
Exercise 4.2.12).
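
The recipe "column $i$ holds the coordinates of the image of the $i$th basis vector" can be run mechanically. The following sketch is limited to maps of $\mathbb{R}^2$ with the same ordered basis on both sides; the function name `matrix_of` and the use of Cramer's rule to find coordinates are our own choices for illustration.

```python
from fractions import Fraction

def T(v):
    """The map T(x, y) = (x + y, x - y) of Examples 4.2.1 and 4.2.2."""
    x, y = v
    return (x + y, x - y)

def matrix_of(T, basis):
    """Matrix of T: R^2 -> R^2 with respect to the ordered `basis` on both sides.
    Column i holds the coordinates of T(basis[i]) in `basis`."""
    (a, b), (c, d) = basis
    det = a * d - b * c
    cols = []
    for v in basis:
        x, y = T(v)
        # Solve s*(a, b) + t*(c, d) = (x, y) by Cramer's rule.
        s = Fraction(x * d - y * c, det)
        t = Fraction(a * y - b * x, det)
        cols.append((s, t))
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(2)] for i in range(2)]

standard = matrix_of(T, ((1, 0), (0, 1)))   # ordered basis {e1, e2}
swapped = matrix_of(T, ((0, 1), (1, 0)))    # ordered basis {e2, e1}
print(standard)
print(swapped)
```

The two results differ although the bases agree as sets, which is the point made above about the order of the basis vectors.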


Example 4.2.3 Let $p_1$ be the map $p_1: \mathbb{R}^2 \to \mathbb{R}$ given by $p_1(x, y) = x$. Then
$p_1(e_1) = p_1(1, 0) = 1$ and $p_1(e_2) = p_1(0, 1) = 0$. Therefore the matrix
associated with $p_1$ is the $1 \times 2$ matrix $(1, 0)$.


Example 4.2.4 Recall the linear map $T$, the multiplication by $i$ in $\mathbb{C}$ (see
Example 4.1.7). As a basis of $\mathbb{C}$ we take $v_1 = 1$ and $v_2 = i$ (see
Exercise 2.3.22). What is the matrix representation of $T$ with respect to this
basis?

$$
\begin{aligned}
Tv_1 &= i \cdot 1 = i = 0 \cdot v_1 + 1 \cdot v_2 \\
Tv_2 &= i \cdot i = -1 = (-1) \cdot v_1 + 0 \cdot v_2.
\end{aligned}
$$

Thus

$$M_v^v(T) = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$


Exercise 4.2.4 More generally, let $\lambda = a + ib \in \mathbb{C}$ be a complex number.
Consider the map $T: \mathbb{C} \to \mathbb{C}$ given by $Tz = \lambda z$, the complex multiplication
of $z$ by $\lambda$. Show that $T$ is a linear map. Compute the matrix of $T$ with
respect to the "usual" basis of $\mathbb{C}$.


Exercise 4.2.5 What is the matrix representation of the conjugation map
$z \mapsto \bar{z}$ of $\mathbb{C}$ with respect to the "usual" basis of $\mathbb{C}$?


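Exercise 4.2.6 If $A: \mathbb{R}^2 \to \mathbb{R}^3$ is given by

$$A \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ x + y \\ y \end{pmatrix},$$

write down the matrix of $A$ with respect to the standard basis of $\mathbb{R}^2$ and
$\mathbb{R}^3$.

Exercise 4.2.7 Let $p_1: \mathbb{R}^2 \to \mathbb{R}^2$ be the projection of $\mathbb{R}^2$ onto its subspace
$\mathbb{R}e_1$ defined by $p_1(x, y) = (x, 0)$. The matrix associated with $p_1$ is

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$

Generalize this to the projection of $\mathbb{R}^n$ onto its subspace

$$W := \{x \in \mathbb{R}^n \mid x_i = 0,\ i > m\} \quad \text{for } 1 \le m \le n.$$


Exercise 4.2.8 Let $p_i: \mathbb{R}^2 \to \mathbb{R}^2$, $i = 1, 2$, be defined by $p_1(x, y) = (x, 0)$
and $p_2(x, y) = (0, x)$. Show that $p_1 \circ p_2 = 0$ while $p_2 \circ p_1$ is not. In
particular, composition of linear maps is not commutative: $S \circ T$ need not
be $T \circ S$, $S, T \in \mathrm{End}(V)$.

What are the matrices of the $p_i$ with respect to the standard basis? Do they
commute?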


Exercise 4.2.9 Let $V$ be a vector space and $\{v_1, \ldots, v_n\}$ a basis of $V$. Let
$\phi: V \to V$ be a linear transformation defined by

$$\phi(v_1) = v_2,\ \phi(v_2) = v_3,\ \ldots,\ \phi(v_{n-1}) = v_n,\ \phi(v_n) = 0.$$

Let $A$ be the matrix associated with $\phi$. We have

$$\phi(v_1) = v_2 = 0v_1 + 1v_2 + 0v_3 + \cdots + 0v_n$$

and

$$\phi(v_j) = v_{j+1} = 0v_1 + 0v_2 + \cdots + 1v_{j+1} + 0v_{j+2} + \cdots + 0v_n.$$

Then we have

$$A = \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$

We have $\phi^n = 0$ and $A^n = 0$.
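
The claim $A^n = 0$ can be checked for a small $n$. The sketch below is a plain-Python illustration with helper names of our own (`shift_matrix`, `matmul`, `power`); matrices are lists of rows, and column $j$ of the shift matrix is $e_{j+1}$, the last column being zero.

```python
def shift_matrix(n):
    """n x n matrix whose column j is e_{j+1}; the last column is zero."""
    return [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def power(A, p):
    """A multiplied by itself p times, starting from the identity."""
    n = len(A)
    R = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        R = matmul(R, A)
    return R

n = 4
A = shift_matrix(n)
zero = [[0] * n for _ in range(n)]
print(power(A, n) == zero)        # A^n vanishes
print(power(A, n - 1) == zero)    # A^(n-1) still sends v_1 to v_n
```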


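Exercise 4.2.10 Let $r: \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$ be a permutation
and let $V = \mathbb{R}^n$. Let $\phi_r: \mathbb{R}^n \to \mathbb{R}^n$ be defined by $e_i \mapsto e_{r(i)}$. Extend $\phi_r$
linearly. As a specific example consider the case of $\mathbb{R}^3$, and let $r$ be the
permutation with $r(1) = 3$, $r(2) = 1$ and $r(3) = 2$.

Then, $\phi_r(e_1) = e_{r(1)} = e_3$, $\phi_r(e_2) = e_{r(2)} = e_1$ and $\phi_r(e_3) = e_{r(3)} = e_2$.
Hence the matrix of $\phi_r$ is given by

$$\phi_r = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}.$$

Matrices corresponding to permutations are called permutation matrices.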
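
A permutation matrix can be written down directly from the rule $e_i \mapsto e_{r(i)}$. In this sketch the permutation is passed as a dictionary on $\{1, \ldots, n\}$; that representation and the name `permutation_matrix` are assumptions made for illustration.

```python
def permutation_matrix(r):
    """Matrix (list of rows) whose column i is e_{r(i)}, for r a dict on 1..n."""
    n = len(r)
    return [[1 if r[col] == row else 0 for col in range(1, n + 1)]
            for row in range(1, n + 1)]

# The permutation of the example: r(1) = 3, r(2) = 1, r(3) = 2.
P = permutation_matrix({1: 3, 2: 1, 3: 2})
for row in P:
    print(row)
```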


We now proceed to investigate the relationship between linear maps and 
the associated matrices. 
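Let the notation be as in the beginning of the section. Let

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{pmatrix}$$

be an $n \times m$ matrix. Is there a linear map $T: V \to W$ whose matrix with
respect to the given bases is $A$? If such a $T$ exists, we then must have
$Tv_i = \sum_{j=1}^{n} a_{ji} w_j$. Once $T$ is defined on the basis of $V$, we can extend it
"linearly" to all of $V$ (see Theorem 4.1.3).

More specifically, if $v := \sum_{r=1}^{m} \lambda_r v_r$, then

$$Tv = \sum_{r=1}^{m} \lambda_r Tv_r = \sum_{j=1}^{n} b_j w_j$$

where $b_j$ is the $(j, 1)$th entry of the matrix product

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_m \end{pmatrix}.$$

By the very definition, $T$ is a linear map whose matrix is $A$.
We have thus proved the following result.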


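Proposition 4.2.1 Let $A$ be an $n \times m$ matrix. Fix bases $\{v_i\}_{i=1}^{m}$ of $V$
and $\{w_j\}_{j=1}^{n}$ of $W$. Let $v \in V$, $v = \sum_{i=1}^{m} \lambda_i v_i$. Define a map $T$ such that
$T(v) = \sum_{j=1}^{n} b_j w_j$, where the $b_j$'s are defined as $b_j = \sum_{r=1}^{m} a_{jr} \lambda_r$. That is,

$$\begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_m \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}.$$

Then $T: V \to W$ is a linear map such that the matrix of $T$ with respect to
the bases $\{v_i\}$ and $\{w_j\}$ is $A$.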


Exercise 4.2.11 Let $V$ and $W$ be vector spaces. Fix a basis $\{v_i\}_{i=1}^{m}$ of $V$
and a basis $\{w_j\}_{j=1}^{n}$ of $W$. Then the map $\varphi: L(V, W) \to M_{n \times m}(\mathbb{R})$ given
by $T \mapsto M_v^w(T)$ is a bijective (one-to-one and onto) linear map.


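Theorem 4.2.2 Let $V, W, U$ be vector spaces. Let $\{v_1, \ldots, v_m\}$ be a basis
for $V$, $\{w_1, \ldots, w_n\}$ be a basis for $W$ and let $\{u_1, \ldots, u_k\}$ be a basis for $U$.
Let $T: V \to W$, $S: W \to U$ be linear maps. Let $A$ and $B$ be the matrices
associated with $T$ and $S$ respectively. Then the matrix associated with $S \circ T$
is $AB$.

Proof We have $Tv_i = \sum_{j=1}^{n} a_{ij} w_j$ and $Sw_j = \sum_{r=1}^{k} b_{jr} u_r$. Hence

$$
\begin{aligned}
S \circ T(v_i) &= S\Big(\sum_{j=1}^{n} a_{ij} w_j\Big) \\
               &= \sum_{j=1}^{n} a_{ij} S w_j \\
               &= \sum_{j=1}^{n} a_{ij} \Big(\sum_{r=1}^{k} b_{jr} u_r\Big) \\
               &= \sum_{r=1}^{k} \Big(\sum_{j=1}^{n} a_{ij} b_{jr}\Big) u_r \\
               &= \sum_{r=1}^{k} c_{ir} u_r
\end{aligned}
$$

where $c_{ir} = \sum_{j=1}^{n} a_{ij} b_{jr}$, the $ir$th entry of the product matrix $AB$. □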


Exercise 4.2.12 Let $A: V \to W$ be a linear map. Fix bases $\{v_1, \ldots, v_m\}$
of $V$ and $\{w_1, \ldots, w_n\}$ of $W$. We denote by $M(A)$ the matrix of the linear
map with respect to these bases. Find the matrix of $A$ with respect to the
following bases:

(1) The basis of $V$ is $\{v_1', v_2', v_3', \ldots, v_m'\}$ where $v_1' = v_2$, $v_2' = v_1$ and
$v_i' = v_i$ for $i \ge 3$ and the basis of $W$ is as given.

(2) The basis of $V$ remains as given but the basis of $W$ is $\{w_k'\}$ where
$w_i' = w_j$ and $w_j' = w_i$ and $w_k' = w_k$ if $k \ne i$ and $k \ne j$.

(3) The basis of $V$ is $\{v_i'\}$ where $v_1' = v_1 + v_2$ and $v_i' = v_i$ for $i \ge 2$ and
the basis of $W$ is as given.

(4) The basis of $V$ is $\{\alpha v_1, v_2, \ldots, v_m\}$, $\alpha \ne 0$, and the basis of $W$ remains
unchanged.


Exercise 4.2.13 Let the notation be as in Exercise 4.1.29. Fix bases
$\{v_1, \ldots, v_m\}$ of $V$ and $\{w_1, \ldots, w_n\}$ of $W$. Let $f_i$ and $g_j$ be the dual
bases of $V^*$ and $W^*$ respectively (see Exercise 4.1.22). Is there any relation
between the matrices $M_v^w(T)$ and $M_g^f(T^*)$?


4.3 Kernel and Image of a Linear 
Transformation 


Definition 4.3.1 Let $T: V \to W$ be a linear map. Then the kernel of $T$,
denoted by $\ker T$, is defined by

$$\ker T = \{v \in V \mid T(v) = 0\}$$

and the image of $T$, denoted by $\mathrm{Im}\, T$, is defined by

$$\mathrm{Im}\, T := T(V) = \{w \in W \mid \text{there exists } v \in V \text{ such that } T(v) = w\}.$$


Exercise 4.3.1 $\ker T$ is a subspace of $V$ and $\mathrm{Im}\, T$ is a subspace of $W$.


Definition 4.3.2 Let $T: V \to W$ be a linear map. The dimension of the
vector subspace $\ker T$ (respectively the dimension of $\mathrm{Im}\, T$) is called the
nullity (respectively the rank) of $T$. That is,

nullity of $T$ = $\dim \ker T$
rank of $T$ = $\dim \mathrm{Im}\, T$.


Example 4.3.1 We wish to find the range and kernel of $T: \mathbb{R}^2 \to \mathbb{R}^2$
defined by

$$T \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ x - y \end{pmatrix}.$$

We have $\ker(T) = \{(x, y) \in \mathbb{R}^2 \mid T(x, y) = (0, 0)\}$. Now $T(x, y) = (0, 0)$
implies that $x + y = 0$ and $x - y = 0$ which implies that $x = 0$, $y = 0$.
Hence $\ker(T) = \{(0, 0)\}$.

Let $(x, y) \in \mathbb{R}^2$. Does there exist $(x_1, y_1) \in \mathbb{R}^2$ such that $T(x_1, y_1) = (x, y)$?
That is, can we find $(x_1, y_1) \in \mathbb{R}^2$ such that $(x_1 + y_1, x_1 - y_1) = (x, y)$? To
find such $x_1, y_1$, we need to solve the system of linear equations

$$
\begin{aligned}
x_1 + y_1 &= x \\
x_1 - y_1 &= y.
\end{aligned}
$$

On solving, we find that $x_1 = \frac{x + y}{2}$ and $y_1 = \frac{x - y}{2}$. Therefore given any
$(x, y) \in \mathbb{R}^2$, we can find $(x_1, y_1) \in \mathbb{R}^2$ such that $T(x_1, y_1) = (x, y)$. Hence
the range is $\mathbb{R}^2$.
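Example 4.3.2 Find the range and kernel of $T: \mathbb{R}^3 \to \mathbb{R}^3$ defined by

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} \mapsto \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 2 \\ 2 & 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x + z \\ x + y + 2z \\ 2x + y + 3z \end{pmatrix}.$$

$\ker(T) = \{(x, y, z) \in \mathbb{R}^3 \mid T(x, y, z) = (0, 0, 0)\}$.
Now $T(x, y, z) = (0, 0, 0)$ if and only if

$$
\begin{aligned}
x + z &= 0, \\
x + y + 2z &= 0, \text{ and} \\
2x + y + 3z &= 0.
\end{aligned}
$$

Hence $x = -z$, $y = x$. Therefore,

$$\ker(T) = \{(a, a, -a) \mid a \in \mathbb{R}\} = \{a(1, 1, -1) \mid a \in \mathbb{R}\}.$$

Let $(x, y, z) \in \mathbb{R}^3$. If $(x, y, z) \in \mathrm{Im}(T)$, then there exists $(x_1, y_1, z_1) \in \mathbb{R}^3$
such that $T(x_1, y_1, z_1) = (x, y, z)$, or

$$(x_1 + z_1,\ x_1 + y_1 + 2z_1,\ 2x_1 + y_1 + 3z_1) = (x, y, z).$$

Therefore $x_1 + z_1 = x$ so that $z_1 = x - x_1$. Now, $x_1 + y_1 + 2z_1 = y$ and hence
$x_1 + y_1 + 2(x - x_1) = y$. That is, $-x_1 + y_1 = y - 2x$. Finally, $2x_1 + y_1 + 3z_1 = z$
and so $2x_1 + y_1 + 3(x - x_1) = z$ or, $-x_1 + y_1 = z - 3x$. Therefore we get
$y - 2x = z - 3x$ and hence $y + x = z$. Thus if $(x, y, z) \in \mathrm{Im}(T)$, $x + y = z$
and hence

$$\mathrm{Im}(T) = \{(x, y, x + y)\} = \{x(1, 0, 1) + y(0, 1, 1) \mid x, y \in \mathbb{R}\}.$$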
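
The two conclusions of the example can be spot-checked numerically. This is a small sketch with a helper `apply` of our own; the sample vectors are arbitrary, and checking a few samples illustrates the relation $x + y = z$ on the image without proving it.

```python
# The matrix of Example 4.3.2, stored as a list of rows.
A = [[1, 0, 1],
     [1, 1, 2],
     [2, 1, 3]]

def apply(A, v):
    """Matrix-vector product A v, with v given as a tuple."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

# (1, 1, -1) spans the kernel found above.
kernel_vector = (1, 1, -1)
image_of_kernel_vector = apply(A, kernel_vector)

# Every image (x, y, z) should satisfy x + y = z.
samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -3, 7)]
images = [apply(A, v) for v in samples]
all_satisfy = all(x + y == z for (x, y, z) in images)

print(image_of_kernel_vector, all_satisfy)
```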


Remark 4.3.1 Do you see that one of the recurring themes, the solution 
of a system of linear equations, is at work in these examples? 


Exercise 4.3.2 Let $T: \mathbb{R}^2 \to \mathbb{R}^3$ be defined by


0-(3) 


Find the range and kernel of T. 


Exercise 4.3.3 Find the kernel and image of each of the linear maps of 
the exercises in Section 5.1. 




Exercise 4.3.4 True or false? If $V$, $W$ are vector spaces and $T: V \to W$
is a linear map and $\{v_1, \ldots, v_n\}$ is a linearly independent set of vectors in
$V$, then $\{T(v_i)\}_{i=1}^{n}$ is linearly independent.


Exercise 4.3.5 If $\{e_1, \ldots, e_n\}$ is a basis of a vector space $V$, and $y_1, \ldots, y_n$
are arbitrary elements of a vector space $W$, then we have already seen that
there exists a unique linear map $T$ such that $Te_i = y_i$ (see Theorem 4.1.3).
If $\{v_1, \ldots, v_n\}$ is an arbitrary set of vectors in $V$, and $\{w_1, \ldots, w_n\}$ is an
arbitrary set of vectors in $W$, does there exist a linear map $T: V \to W$
such that $Tv_i = w_i$?


Exercise 4.3.6 If $W$ is a subspace of a vector space $V$, and $T: W \to X$ a
linear transformation, does there exist a linear map $\tilde{T}: V \to X$ such that
$\tilde{T}(w) = T(w)$ for all $w \in W$? If so, how many? Hint: Look first at $V = \mathbb{R}^2$,
$W = \mathbb{R}e_1 = \{(x, 0) \mid x \in \mathbb{R}\}$ and $X = \mathbb{R}$ (recall Theorem 4.1.3).


Exercise 4.3.7 If $V$ and $W$ are vector spaces and $S, T: V \to W$ are linear
transformations such that $\ker(T) = \ker(S)$ and $\mathrm{Im}(T) = \mathrm{Im}(S)$, is $S = T$?


Exercise 4.3.8 Let $V$ and $X$ be vector spaces. If $W$ is a vector subspace
of $V$, does there exist a transformation $T: V \to X$ such that $\ker(T) = W$?
Hint: Theorem 4.1.3 and Theorem 2.3.8.


In the examples considered so far, we see that

$$\dim V = \dim \ker(T) + \dim \mathrm{Im}(T).$$

In fact, we have the following theorem:


Theorem 4.3.1 (Rank-Nullity Theorem) Let $V$ and $W$ be finite
dimensional vector spaces and $T: V \to W$ be a linear transformation.
Then

$$\dim V = \dim \mathrm{Im}(T) + \dim \ker(T) = \text{rank } T + \text{nullity } T. \qquad (4.3.1)$$


Proof Let $\{u_1, \ldots, u_k\}$ be a basis for $\ker T$. This can be extended to a
basis of $V$, say, $\{u_1, \ldots, u_k, v_1, \ldots, v_n\}$. We now prove that $\{Tv_1, \ldots, Tv_n\}$
is a basis of $\mathrm{Im}(T)$. Let $w \in \mathrm{Im}(T)$. Then there exists $v \in V$ such that
$T(v) = w$. Now $v = \sum_{i=1}^{k} \alpha_i u_i + \sum_{j=1}^{n} \beta_j v_j$. Therefore,

$$
\begin{aligned}
w = T(v) &= T\Big(\sum_{i=1}^{k} \alpha_i u_i + \sum_{j=1}^{n} \beta_j v_j\Big) \\
         &= \sum_{i=1}^{k} \alpha_i T(u_i) + \sum_{j=1}^{n} \beta_j T(v_j) \\
         &= \sum_{j=1}^{n} \beta_j T(v_j), \quad \text{since } u_i \in \ker(T).
\end{aligned}
$$

This implies that $\{Tv_1, \ldots, Tv_n\}$ spans $\mathrm{Im}(T)$.

We claim that $\{Tv_i\}$ is linearly independent. Let $\sum_{i=1}^{n} a_i Tv_i = 0$ for
some scalars $a_i$. Then $T\big(\sum_{i=1}^{n} a_i v_i\big) = 0$. This means that

$$\sum_{i=1}^{n} a_i v_i \in \ker(T).$$

But then $\sum_{i=1}^{n} a_i v_i = \sum_{j=1}^{k} \beta_j u_j$ since $\{u_1, \ldots, u_k\}$ is a basis of $\ker(T)$.
Therefore,

$$\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{k} \beta_j u_j = 0.$$

Since $\{u_1, \ldots, u_k, v_1, \ldots, v_n\}$ is a basis of $V$, we see that $a_i = 0$ and $\beta_j = 0$
for all $i$ and $j$. Hence the claim follows.

Thus $\{Tv_i\}_{i=1}^{n}$ is linearly independent and spans $\mathrm{Im}(T)$ so that the dimension
of $\mathrm{Im}(T)$ is $n$. Now, $\dim \mathrm{Im}(T) = n = n + k - k = \dim V - \dim \ker(T)$.
Hence, we have $\dim V = \dim \ker(T) + \dim \mathrm{Im}(T)$. □


Example 4.3.3 Let us do Example 4.3.2 again. We have found that
$\dim \ker T = 1$. Proceeding as in Example 4.3.2, we find that if $(x, y, z)$
is in $\mathrm{Im}(T)$ then we have to solve the system

$$
\begin{aligned}
x_1 + z_1 &= x \\
x_1 + y_1 + 2z_1 &= y \\
2x_1 + y_1 + 3z_1 &= z.
\end{aligned}
$$

Subtracting the second equation from the third, we get $x_1 + z_1 = z - y$.
This along with the first equation implies that $x = z - y$ or $z = x + y$.
Thus, $\mathrm{Im}(T) \subset W := \{(x, y, z) \in \mathbb{R}^3 \mid z = x + y\}$. This is a vector
subspace of dimension 2. Now Equation (4.3.1) implies that $\dim \mathrm{Im}(T) = 2$:
$\dim \mathbb{R}^3 = \dim \ker(T) + \dim \mathrm{Im}(T) = 1 + \dim \mathrm{Im}(T)$. Thus $\mathrm{Im}(T)$ is a
two-dimensional vector subspace of the two-dimensional vector space $W$ and
hence $\mathrm{Im}(T) = W$.

This is an example of how one uses theory to cut down excessive 
computations, shorten the arguments and gain insight. 
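
For a map given by a matrix, the rank can be computed by row reduction and the nullity then follows from Equation (4.3.1). The sketch below is one possible way to do this, using exact rational arithmetic; the function name `rank` and the elimination strategy are our own choices, not a method prescribed by the text.

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

# The matrix of Example 4.3.2.
A = [[1, 0, 1],
     [1, 1, 2],
     [2, 1, 3]]
rank_T = rank(A)
nullity_T = len(A[0]) - rank_T   # dim V - rank, by the Rank-Nullity Theorem
print(rank_T, nullity_T)
```

The computed values agree with $\dim \mathrm{Im}(T) = 2$ and $\dim \ker T = 1$ found in Examples 4.3.2 and 4.3.3.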


Exercise 4.3.9 Let $V = \mathbb{R}^n$ and $A$ be an $n \times n$ matrix. If $Ax = 0$ has a
unique solution then $Ax = b$ has a unique solution for every $b \in \mathbb{R}^n$.


Exercise 4.3.10 Can you construct a linear transformation $T: \mathbb{R}^2 \to \mathbb{R}^4$
such that $\mathrm{Im}(T) = \{(x_1, x_2, x_3, x_4) \in \mathbb{R}^4 \mid x_1 + x_2 + x_3 + x_4 = 0\}$?


Exercise 4.3.11 Can you construct a linear transformation $T: \mathbb{R}^2 \to \mathbb{R}^3$
such that $\mathrm{Im}(T) = \{(x, y, z) \in \mathbb{R}^3 \mid x + y + z = 0\}$?


Exercise 4.3.12 Let $T: V \to V$ be a linear map such that $\mathrm{Im}\, T = \ker T$.
What can you say about $T^2$? (By the way, can you construct such a map
from $\mathbb{R}^2$ to $\mathbb{R}^2$?)


4.4 Linear Isomorphism 


Definition 4.4.1 Let $V$ and $W$ be vector spaces over $\mathbb{R}$. A linear map
$T: V \to W$ is said to be a (linear) isomorphism if $T$ is one-one and onto.
We then say $V$ is isomorphic to $W$.


Exercise 4.4.1 If $T: V \to W$ is an isomorphism, then the set theoretic
inverse $T^{-1}: W \to V$ is linear and an isomorphism.


Exercise 4.4.2 Isomorphism is an equivalence relation. This means the 
following: 


(i) V is isomorphic to itself. 
(ii) If V is isomorphic to W, then W is isomorphic to V. 


(iii) If V is isomorphic to W, W is isomorphic to U, then V is isomorphic 
to U. 


Before we say why this concept is important, let us look at some examples.

Example 4.4.1 Let V be the space of polynomials (with real coefficients) of degree less than or equal to n and W be $\mathbb{R}^{n+1}$. Then the map

\[ T : P = \sum_{i=0}^{n} a_i X^i \mapsto (a_0, \dots, a_n) \in \mathbb{R}^{n+1} \]

is an isomorphism.


Example 4.4.2 Let V be any vector space over R. Let $T : V \to V$ be defined by $T(x) = ax$, $a \neq 0$. Then T is a linear isomorphism.


Example 4.4.3 Let $V = M_{n \times m}(\mathbb{R})$ be the set of $n \times m$ matrices with real entries and let $W = \mathbb{R}^{nm}$. Define $f : V \to W$ by

\[ f(A) = (a_{11}, \dots, a_{1m}, a_{21}, \dots, a_{2m}, \dots, a_{n1}, \dots, a_{nm}). \]

Here $A = (a_{ij}) \in V$. It is easily seen that f is an isomorphism.
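
The map of Example 4.4.3 simply reads a matrix row by row. The sketch below illustrates this under the assumption that matrices are stored as lists of rows; the names `flatten`, `unflatten`, `add` and `scale` are hypothetical helpers chosen for illustration.

```python
def flatten(A):
    """f: an n x m matrix (list of rows) -> a vector of length n*m, read row by row."""
    return [entry for row in A for entry in row]


def unflatten(v, n, m):
    """Inverse of f: rebuild the n x m matrix from a vector of length n*m."""
    assert len(v) == n * m
    return [v[i * m:(i + 1) * m] for i in range(n)]


def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    """Scalar multiple of a matrix."""
    return [[c * a for a in row] for row in A]


A = [[1, 2, 3], [4, 5, 6]]
B = [[0, 1, 0], [7, 8, 9]]
```

That `unflatten` undoes `flatten` shows f is one-one and onto; compatibility with `add` and `scale` is the linearity of f.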




Example 4.4.4 This is a slightly more abstract isomorphism, worth learning thoroughly. Let V and W be n-dimensional vector spaces over R. Let $\{v_i\}_{i=1}^n$ (respectively $\{w_i\}_{i=1}^n$) be a basis of V (respectively W) over R. Given $x \in V$, we can then write $x = \sum_i a_i v_i$. We let $f(x) = \sum_i a_i w_i$. That is, we set $f(v_i) = w_i$ and extend this linearly over R. Then f is an isomorphism of V onto W. We leave the proof of this assertion to the reader.

We invite the reader to check that all the isomorphisms in the previous examples were obtained this way. In Example 4.4.1, $\{P_i = X^i\}_{i=0}^n$ is a basis of V. T maps $X^i$ to $e_i \in \mathbb{R}^{n+1}$ (the standard basis of $\mathbb{R}^{n+1}$ being indexed here as $e_0, \dots, e_n$) and is extended linearly to all of V. If $P \in V$, $P = \sum_i a_i X^i$, then

\[ T(P) = \sum_i a_i T(X^i) = \sum_i a_i e_i = (a_0, a_1, \dots, a_n) \in \mathbb{R}^{n+1}. \]

We leave the other cases as exercises to the reader. The theorem below says that any isomorphism between V and W "arises" this way.


Theorem 4.4.1 Let $T : V \to W$ be a linear map. Then T is an isomorphism if and only if $\{T(v_i)\}_{i=1}^n$ is a basis of W for any basis $\{v_i\}_{i=1}^n$ of V.

Proof Let $\{v_i\}_{i=1}^n$ be a basis of V and let T be a (linear) isomorphism. We need to prove that $\{T(v_i)\}_{i=1}^n$ is a basis of W.

We first of all show that $\{T(v_i)\}_{i=1}^n$ is linearly independent. If

\[ \sum_i a_i T(v_i) = 0 \]

for some scalars $a_i \in \mathbb{R}$, then by linearity of T, we have

\[ \sum_i a_i T(v_i) = T\Big(\sum_i a_i v_i\Big) = 0. \]

Since T is one-one and $T(0) = 0$, we see that $\sum_i a_i v_i = 0$. But $\{v_i\}_{i=1}^n$ is a basis of V and hence $\sum_i a_i v_i = 0$ if and only if $a_i = 0$ for all i. Thus $\{T(v_i)\}_{i=1}^n$ is a linearly independent set.

To prove that $\{T(v_i)\}_{i=1}^n$ is a basis, it is now enough to show that $\{T(v_i)\}_{i=1}^n$ spans W. Since $T : V \to W$ is onto, given $w \in W$, there exists $v \in V$ such that $T(v) = w$. Write $v = \sum_i a_i v_i$. Then by linearity,

\[ w = T(v) = T\Big(\sum_i a_i v_i\Big) = \sum_i a_i T(v_i). \]

This implies w is a linear combination of the $T(v_i)$'s. Thus $\{T(v_i)\}_{i=1}^n$ is a basis of W.

The converse is essentially Example 4.4.4 and hence left as an exercise. □




Exercise 4.4.3 Isomorphic vector spaces have the same dimension.


Exercise 4.4.4 Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $T(x, y) = (ax + by, cx + dy)$. Then T is an isomorphism if and only if $ad - bc \neq 0$. Write explicitly $T^{-1}$ when it exists. What are the matrices of T and $T^{-1}$ (whenever the latter exists) with respect to the standard basis of $\mathbb{R}^2$?


The importance of the concept of linear isomorphism is as follows. Let $f : V \to W$ be an isomorphism. Let us assume that we want to solve the vector equation: Find $x_j \in V$ such that the equations $\sum_j a_{ij} x_j = \beta_i$, where $a_{ij} \in \mathbb{R}$, $\beta_i \in V$, are satisfied. We can solve this in V, that is, find the $x_j$, if and only if we can solve the equations $\sum_j a_{ij} y_j = f(\beta_i)$ in W. It may happen that this second system is readily solvable. Then we take $x_j = f^{-1}(y_j)$ and these $x_j$'s solve the original system.


Exercise 4.4.5 Let $T : V \to W$ be linear. Then T is one-one if and only if $\ker(T) = 0$. Hint: If $Tx = Ty$, then $T(x - y) = 0$.


We include a proof of the following theorem for completeness' sake. However, we urge the reader to write out a proof on his/her own.


Theorem 4.4.2 Let V be a finite dimensional vector space. Let $T : V \to V$ be a linear map. Then the following are equivalent:


(1) T is an isomorphism. 
(2) kerT = {0}. 
(3) Im(T) =V. 


Proof Assume that (1) holds. Since $T(0) = 0$ for any linear map, if $v \in V$ is such that $T(v) = 0$, then $v = 0$ since T is one-one.

Assume that (2) holds. We have to show that $\operatorname{Im} T = V$. By the rank-nullity theorem (Equation (4.3.1)), we have

\[ \dim \operatorname{Im} T = \dim V - \dim \ker T = \dim V - 0 = \dim V. \]

Since $\operatorname{Im} T$ is a vector subspace of V, by Exercise 2.3.14, $\operatorname{Im} T = V$. Thus (3) follows.

Assume that (3) holds, that is, T is onto. Hence $\dim \operatorname{Im} T = \dim V$. To prove that T is an isomorphism, it is therefore enough to show that it is one-one. If T is not one-one, then there exist $v_1, v_2 \in V$ such that $v_1 \neq v_2$ and $T(v_1) = T(v_2)$. This implies that $T(v_1 - v_2) = 0$, by linearity of T. Thus the nonzero element $v_1 - v_2$ lies in $\ker T$, so that $\dim \ker T \geq 1$. Using the rank-nullity theorem (Equation (4.3.1)), we get

\[ \dim \operatorname{Im} T = \dim V - \dim \ker T < \dim V. \]

This contradicts our assumption that $\dim \operatorname{Im} T = \dim V$. So, we conclude that T is one-one. □


Remark 4.4.1 The reader should compare this result with the following: 
Let X be a finite set. Then a map f: X — X is a bijection if and only if it 
is one-one if and only if it is onto. In fact, the reader can supply a proof of 
Theorem 4.4.2 using this fact and Theorem 4.4.1. In light of the fact that 
such a result is false in the case of infinite sets, he may want to investigate 
the validity of the theorem for infinite dimensional vector spaces. 


Exercise 4.4.6 Any n-dimensional vector space is isomorphic to $\mathbb{R}^n$ and then any two n-dimensional vector spaces are isomorphic.


Exercise 4.4.7 Let T be the map as in Example 2.1.3. Then T is an isomorphism.


Exercise 4.4.8 The map in Exercise 4.2.11 is an isomorphism. Hence 
conclude that dim L(V, W) = dim(V) x dim(W). 


Exercise 4.4.9 Let V denote the space in Exercise 2.1.22. Let $T : V \to \mathbb{R}^n$ be defined by

\[ Tf = (f(a), f'(a), \dots, f^{(n-1)}(a)). \]

Then T is a linear isomorphism. Hint: You need results from the theory of linear systems of ordinary differential equations.


Exercise 4.4.10 Let V be a vector space and W a subspace of V. Then there exists an onto linear map $T : V \to V/W$ such that $W = \ker T$.

Exercise 4.4.11 Let V and U be vector spaces and $T : V \to U$ be a linear map with kernel K. Let $W = \operatorname{Im} T$. Then $W \cong V/K$. (This is called the Fundamental Theorem of Homomorphisms.)


Exercise 4.4.12 If W is a subspace of V, every subspace of V/W is of the 
form T/W where T is a subspace of V containing W. 


Exercise 4.4.13 This is an extension of Exercise 4.2.12. Let $\{v_1, \dots, v_m\}$ and $\{v'_1, \dots, v'_m\}$ be two bases of V and $\{w_1, \dots, w_n\}$ and $\{w'_1, \dots, w'_n\}$ be two bases of W. Let $M^{w}_{v}(A)$ (respectively, $M^{w'}_{v'}(A)$) denote the matrix of a linear transformation $A : V \to W$ with respect to the bases $\{v_1, \dots, v_m\}$ and $\{w_1, \dots, w_n\}$ (respectively $\{v'_1, \dots, v'_m\}$ and $\{w'_1, \dots, w'_n\}$). Find the relation between $M^{w}_{v}(A)$ and $M^{w'}_{v'}(A)$.




4.5 Geometric Ideas and Some Loose Ends 


Given an $n \times m$ matrix $A = (a_{ij})$, one may think of it as a listing of (column) vectors from $\mathbb{R}^n$. For, we may write $A = (C_1, \dots, C_m)$ where

\[ C_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{pmatrix} \]

is the jth column. In particular, we think of A as the linear map which takes the jth vector in the standard basis of $\mathbb{R}^m$ to $C_j \in \mathbb{R}^n$: $Ae_j = C_j$, $1 \leq j \leq m$. We find this way of looking at matrices quite geometric and useful on many occasions.

The choice of a basis for a vector space V allows us to set up an isomorphism from V to $\mathbb{R}^n$, where $n = \dim V$. Let $\{v_1, \dots, v_n\}$ be a basis of V. Write $v = \sum_{i=1}^n a_i v_i$. Then the map $T : V \to \mathbb{R}^n$ defined by

\[ T(v) := (a_1, \dots, a_n) \]

is a linear isomorphism. (Isn't this Exercise 4.4.6?)

Let V and W be vector spaces, $T : V \to W$ a linear map. Let $\{v_j\}_{j=1}^m$ be a basis of V and $\{w_i\}_{i=1}^n$ a basis of W. We want to look at the matrix of T with respect to these bases in a geometric way.

If $A = (a_{ij})$ is the matrix representation of T with respect to these bases then the column vector

\[ \begin{pmatrix} a_{1i} \\ \vdots \\ a_{ni} \end{pmatrix} \]

stands for the vector $y_i = \sum_j a_{ji} w_j \in W$.

We shall put these ideas into use to prove Theorem 4.5.1.
Let $A = (a_{ij})$ be an $m \times n$ matrix. We write

\[ A = \begin{pmatrix} R_1 \\ \vdots \\ R_m \end{pmatrix} \]

where $R_i := (a_{i1}, \dots, a_{in})$ is the ith row. Similarly, we write

\[ A = (C_1, \dots, C_n) \quad \text{where} \quad C_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix} \]

is the jth column.

We may consider $R_i$ (respectively, $C_j$) as a row vector in $\mathbb{R}^n$ (respectively as a column vector in $\mathbb{R}^m$). The vector subspace spanned by the $R_i$'s (respectively $C_j$'s) is called the row space (respectively column space) of A.

The row rank of A is defined to be the dimension of the vector subspace of $\mathbb{R}^n$ spanned by the $R_i$'s. The column rank of A is defined to be the dimension of the vector subspace of $\mathbb{R}^m$ spanned by the $C_j$'s.

If we think of A as the linear map which takes the jth element of the standard basis of $\mathbb{R}^n$ to the jth column $C_j$, then the column space is nothing other than $\operatorname{Im}(A)$. Hence the column rank of A is the dimension of $\operatorname{Im}(A)$.
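
The statement that A sends the jth standard basis vector to its jth column, so that every Ax is a linear combination of the columns, can be illustrated as follows. This is a small sketch with an arbitrarily chosen 2 × 3 matrix; `matvec`, `column` and `std_basis` are hypothetical helper names.

```python
def matvec(A, x):
    """Product of the matrix A (list of rows) with the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def column(A, j):
    """The jth column of A (0-based index)."""
    return [row[j] for row in A]


def std_basis(n, j):
    """The jth standard basis vector of R^n (0-based index)."""
    return [1 if i == j else 0 for i in range(n)]


A = [[1, 2, 0],
     [0, 1, 3]]   # a 2 x 3 matrix, viewed as a map from R^3 to R^2

x = [2, -1, 4]
# A x written as x_1 C_1 + x_2 C_2 + x_3 C_3
combination = [sum(x[j] * column(A, j)[i] for j in range(3)) for i in range(2)]
```

Since `matvec(A, x)` agrees with `combination` for every x, the image of A is exactly the span of its columns.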


Exercise 4.5.1 Compute the row and column ranks of


1 0 1 0 2 100 0 
iV 0 2 2. 02 2) 3 [001 OJ. 
0 0 00 0 000 0 


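The main result of this section is the following theorem:

Theorem 4.5.1 The row rank and the column rank of an $m \times n$ matrix $A = (a_{ij})$ are equal.

Proof Let k be the row rank of A. Let $\{v_1, \dots, v_k\}$ be a basis of the row space of A. We then can write $R_i = \sum_{r=1}^k \alpha_{ir} v_r$. Let $v_r = (b_{r1}, \dots, b_{rn})$ for $1 \leq r \leq k$. Since

\[ R_i = (a_{i1}, a_{i2}, \dots, a_{in}) = \sum_{r=1}^k \alpha_{ir} v_r = \sum_{r=1}^k \alpha_{ir} (b_{r1}, \dots, b_{rn}), \]

we get an equation among the coordinates

\[ a_{ij} = \sum_{r=1}^k \alpha_{ir} b_{rj} \quad \text{for } 1 \leq i \leq m \text{ and } 1 \leq j \leq n. \]

That is,

\[
\begin{aligned}
a_{1j} &= \alpha_{11} b_{1j} + \alpha_{12} b_{2j} + \cdots + \alpha_{1k} b_{kj}\\
a_{2j} &= \alpha_{21} b_{1j} + \alpha_{22} b_{2j} + \cdots + \alpha_{2k} b_{kj}\\
&\ \,\vdots\\
a_{mj} &= \alpha_{m1} b_{1j} + \alpha_{m2} b_{2j} + \cdots + \alpha_{mk} b_{kj}.
\end{aligned}
\]

Hence, we get, for $1 \leq j \leq n$,

\[ \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix} = b_{1j} \begin{pmatrix} \alpha_{11} \\ \alpha_{21} \\ \vdots \\ \alpha_{m1} \end{pmatrix} + b_{2j} \begin{pmatrix} \alpha_{12} \\ \alpha_{22} \\ \vdots \\ \alpha_{m2} \end{pmatrix} + \cdots + b_{kj} \begin{pmatrix} \alpha_{1k} \\ \alpha_{2k} \\ \vdots \\ \alpha_{mk} \end{pmatrix}. \]

But the left side of the above equation is $C_j$. Thus any column is a linear combination of k vectors. Hence the dimension of the column space is at most k. Thus the column rank is less than or equal to the row rank.

A similar argument yields the reverse inequality. Hence the result. □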
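
Theorem 4.5.1 can be tested on small examples. The sketch below is only a numerical illustration, not a proof: it computes the row rank of A by Gaussian elimination over the rationals and the column rank as the row rank of the transpose. The helper names `rank` and `transpose` and the sample matrices are assumptions made for this illustration.

```python
from fractions import Fraction


def rank(rows):
    """Dimension of the row space, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def transpose(rows):
    """Rows of the transpose are the columns of the original matrix."""
    return [list(col) for col in zip(*rows)]


A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]

row_rank = rank(A)
column_rank = rank(transpose(A))   # row space of the transpose = column space of A
```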


Exercise 4.5.2 Complete the proof of Theorem 4.5.1. 


Definition 4.5.1 The common value of the row and column ranks of A is called the rank of A and denoted by rank A.


Exercise 4.5.3 For an m x n matrix, what is the largest possible value of 
rank A? 

Exercise 4.5.4 If A is an 11 × 7 matrix, show that the rows of A are linearly dependent.


Exercise 4.5.5 If A is a 3 × 5 matrix, show that the columns of A are linearly dependent.


Definition 4.5.2 We say an $n \times n$ matrix A is non-singular if rank A = n.


Exercise 4.5.6 Show that A is non-singular if and only if A is invertible. 
That is, there exists a matrix B such that AB = BA =I. Hint: There is 
a linear map corresponding to A. Rank A = n says something about the 
linear map. 


Exercise 4.5.7 Let A be an $m \times n$ matrix and B an $n \times k$ matrix. Show that the rank of AB is at most the minimum of the ranks of A and B. Can it be strictly less than the minimum? Hint: Look at the linear map $T : \mathbb{R}^2 \to \mathbb{R}^2$ which maps $e_1$ to $e_2$ and $e_2$ to zero. Let A be the matrix of this linear map.


Exercise 4.5.8 Let the notation be as in Exercise 4.5.7. True or false? If A is of rank r and B is of maximal rank, that is, the rank of B is $\min\{n, k\}$, then the rank of AB is the rank of A.


4.6 Some Special Linear Transformations 


In this section, we show how a geometric object transforms under some 
special linear transformations. The idea behind this exercise is that the 
reader will learn how to look at linear maps as geometric maps. 

In the following series of examples, we show how the geometric picture "E" in $\mathbb{R}^2$ transforms under certain linear maps of $\mathbb{R}^2$. The letter E is considered as the subset, some of whose special points are the origin, $e_1$, $(0, 1/2)$, $(1, 1/2)$, $e_2$ and $e_1 + e_2$ (see Example 1 on page 80).


We explain by means of an example how the picture of T(E) is drawn where

\[ T = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix} \]

is as in Example 5 below. First note that we can write

\[ T = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} = AB, \text{ say.} \]

Clearly, B sends $e_1$ to $e_1$ and $e_2$ to $2e_2$, or more generally, sends $(x, y)$ to $(x, 2y)$. Thus it stretches the vector $(x, y)$ by a factor of 2 in the y-direction. Obviously, A is the reflection with respect to the y-axis. Thus T is a composition of these two maps. Now what are the images (under T) of the special points listed above? Using the fact that $T(e_1) = -e_1$ and $T(e_2) = 2e_2$, we see that

\[
\begin{aligned}
T(0) &= 0, & T(e_1) &= -e_1,\\
T(0e_1 + e_2/2) &= e_2, & T(e_1 + e_2/2) &= -e_1 + e_2,\\
T(e_2) &= 2e_2, & T(e_1 + e_2) &= -e_1 + 2e_2.
\end{aligned}
\]


Thus the end points of the lower arm of E go to 0 and $-e_1$. Since T is linear, it maps the points on the line segment joining 0 and $e_1$ into points of the line segment joining 0 and $-e_1$. (In fact, if p divides the line segment $[x, y]$ in the ratio $t : 1 - t$, that is, $p = tx + (1 - t)y$, then Tp divides the "line segment" $[Tx, Ty]$ in the ratio $t : 1 - t$. It is possible that Tx and Ty are scalar multiples of each other and hence the reason for the quotation marks.) Thus under T the lower arm goes to the lower arm of TE as shown in the figure. Proceeding in a similar way, the reader can show that the image TE is as shown in the figure in Example 5.

The reader must convince himself of the validity of the figures of the other examples, given on pages 80-85, in an analogous manner.
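
The images of the special points of E under T = AB, and the fact that T preserves the ratio in which a point divides a segment, can be checked with a few lines of exact arithmetic. This is a sketch only; `matmul`, `apply` and `divide` are hypothetical helper names introduced for the illustration.

```python
from fractions import Fraction


def matmul(M, N):
    """Product of two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]


def apply(M, p):
    """Image of the point p under the linear map with matrix M."""
    return tuple(sum(m * c for m, c in zip(row, p)) for row in M)


def divide(x, y, t):
    """The point p = t x + (1 - t) y of the segment [x, y]."""
    return tuple(t * a + (1 - t) * b for a, b in zip(x, y))


A = [[-1, 0], [0, 1]]   # reflection with respect to the y-axis
B = [[1, 0], [0, 2]]    # stretching by a factor of 2 in the y-direction
T = matmul(A, B)

half = Fraction(1, 2)
# origin, e1, (0, 1/2), (1, 1/2), e2, e1 + e2
special_points = [(0, 0), (1, 0), (0, half), (1, half), (0, 1), (1, 1)]
images = [apply(T, p) for p in special_points]
```

Joining the image points in the same pattern as the original points gives the picture of T(E).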


1. Identity:

\[ A_1 = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]

2. Stretching along $e_1$ (x-direction):

\[ A_2 = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \]

3. Stretching along $e_2$ (y-direction):

\[ A_3 = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]

4. Reflection with respect to y-axis:

\[ A_4 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \]

5. Stretching along y-direction and reflection with respect to y-axis:

\[ A_5 = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix} \]

6. Rotation by an angle $\theta$:

\[ A_6 = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]

7. Rotation by $\theta = \pi/2$:

\[ A_7 = \begin{pmatrix} \cos\frac{\pi}{2} & -\sin\frac{\pi}{2} \\ \sin\frac{\pi}{2} & \cos\frac{\pi}{2} \end{pmatrix} \]


2 ] 


9. Shear and rotation: 


0 -1 
w=(1 7) 


10. Rotation and stretching:

\[ A_{10} = \begin{pmatrix} 0 & -1 \\ 2 & 0 \end{pmatrix} \]

11. Stretching and rotation:


mel 


Lif 


12. Projection:

\[ A_{12} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \]

13. Projection and rotation:

\[ A_{13} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \]

14. Projection and stretching:

\[ A_{14} = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix} \]


5. Inner Product Spaces 


In our study so far, no metric concepts such as length, angle and distance were encountered. In this chapter, we shall study a special class of vector spaces which is very rich in geometry. In fact, a model of Euclidean Geometry is provided by these spaces. We shall first look into the most familiar of these, namely the Euclidean plane. This will allow us to fine-tune our geometric insight.


5.1 Inner Product Spaces 


5.1.1 The Euclidean Plane and the Dot Product 
We shall recall the length of a vector in $\mathbb{R}^2$. Let $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R}^2$.

By Pythagoras theorem, $OA^2 + AP^2 = OP^2$ or $x_1^2 + x_2^2 = OP^2$ (see Figure 5.1.1).

Figure 5.1.1 Length of a vector in $\mathbb{R}^2$.

We define the length or norm of x as the positive square root $\sqrt{x_1^2 + x_2^2}$ and denote it by $\|x\|$.

Again, by Pythagoras theorem (see Figure 5.1.2), we have

\[ \|x - y\|^2 = (x_1 - y_1)^2 + (x_2 - y_2)^2. \]

Hence $\|x - y\| = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$. Let $d(x, y) = \|x - y\|$. This gives the distance between the points x and y.


Figure 5.1.2 Distance between x and y.


All these concepts can be captured by an additional structure, called the 
dot product on R?. 


Definition 5.1.1 Let $x = (x_1, x_2)$, $y = (y_1, y_2)$. Then the dot product of x and y, denoted by $\langle x, y \rangle$ (or, $x \cdot y$) is defined as $\langle x, y \rangle = x_1 y_1 + x_2 y_2$.

Note that we get the earlier notions such as length and distance from the dot product:

(a) If $x = y$, then $\langle x, x \rangle = x_1^2 + x_2^2 = \|x\|^2$. Thus $\|x\| = \langle x, x \rangle^{1/2}$.

(b) $d(x, y) = \|x - y\| = \langle x - y, x - y \rangle^{1/2} = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$.


Now that we know that this new notion captures the old metric concepts, 
we may wonder what its geometric meaning is. 

Let $P = (x_1, x_2) = x$ and $Q = (y_1, y_2) = y$. From plane trigonometry we know that given two sides of a triangle and the included angle between them, the remaining side can be calculated using the law of cosines. From Figure 5.1.3, we have

\[ PQ^2 = OP^2 + OQ^2 - 2\, OP \cdot OQ \cos\theta. \]

By the definition of the length of a vector and the dot product, we have

\[ (x_1 - y_1)^2 + (x_2 - y_2)^2 = \|x\|^2 + \|y\|^2 - 2 \|x\| \|y\| \cos\theta. \]

Simplifying, we get

\[ (x_1^2 + x_2^2) + (y_1^2 + y_2^2) - 2(x_1 y_1 + x_2 y_2) = (x_1^2 + x_2^2) + (y_1^2 + y_2^2) - 2 \|x\| \|y\| \cos\theta \]

or $\|x\| \|y\| \cos\theta = x_1 y_1 + x_2 y_2 = \langle x, y \rangle$ and so $\langle x, y \rangle = \|x\| \|y\| \cos\theta$.

Thus the geometric significance of the dot product is: If $\theta$ is the angle between the lines $\mathbb{R}x$ and $\mathbb{R}y$, then

\[ \langle x, y \rangle = \|x\| \|y\| \cos\theta \]

and

\[ \langle x - y, x - y \rangle = \|x\|^2 + \|y\|^2 - 2 \|x\| \|y\| \cos\theta. \]

Figure 5.1.3 Angle between vectors.


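We now introduce a dot product on $\mathbb{R}^n$ in a way similar to that on $\mathbb{R}^2$.

Definition 5.1.2 If $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ are in $\mathbb{R}^n$, then their dot product $\langle x, y \rangle$ (or, $x \cdot y$) is defined as $\langle x, y \rangle = \sum_{i=1}^n x_i y_i$.

We ask the reader to verify that the dot product has the following properties:

(1) $\langle x, x \rangle \geq 0$ and $\langle x, x \rangle = 0$ if and only if $x = 0$.
(2) $\langle x, y \rangle = \langle y, x \rangle$, for all $x, y \in \mathbb{R}^n$.
(3) $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$, for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$.
(4) $\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle$, for all $x, y, z \in \mathbb{R}^n$.

(2) and (3) imply that $\langle x, \beta y \rangle = \beta \langle x, y \rangle$ for $x, y \in \mathbb{R}^n$, $\beta \in \mathbb{R}$. (2) and (4) imply that $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$ for all $x, y, z \in \mathbb{R}^n$.

Definition 5.1.3 If $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ then the length or norm of the vector x is denoted by $\|x\|$ and given by

\[ \|x\| = \langle x, x \rangle^{1/2} = \Big( \sum_{i=1}^n x_i^2 \Big)^{1/2}. \]

For $x, y \in \mathbb{R}^n$, the distance between x and y is defined as

\[ d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y \rangle} = \Big( \sum_{i=1}^n (x_i - y_i)^2 \Big)^{1/2}. \]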
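
The definitions of the dot product, norm and distance on $\mathbb{R}^n$ translate directly into code. The following is a minimal floating-point sketch; the function names `dot`, `norm`, `dist` and `angle` are chosen for illustration, and `angle` assumes nonzero vectors and uses the relation $\langle x, y \rangle = \|x\| \|y\| \cos\theta$ derived above for the plane.

```python
import math


def dot(x, y):
    """Dot product <x, y> = sum of x_i * y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Length of x: the nonnegative square root of <x, x>."""
    return math.sqrt(dot(x, x))


def dist(x, y):
    """Distance d(x, y) = ||x - y||."""
    return norm([xi - yi for xi, yi in zip(x, y)])


def angle(x, y):
    """Angle theta in [0, pi] with cos(theta) = <x, y> / (||x|| ||y||); x, y nonzero."""
    c = dot(x, y) / (norm(x) * norm(y))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp to guard against rounding
```

For instance, `norm([3, 4])` is the familiar hypotenuse of the 3-4-5 triangle, and the angle between (1, 0) and (1, 1) comes out as π/4 up to rounding.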


5.1.2 General Inner Product Spaces 


We abstract the properties of the dot product on R” in the following 
definition. 


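Definition 5.1.4 An inner product or a dot product on a vector space V is a map $\langle\ ,\ \rangle : V \times V \to \mathbb{R}$ satisfying the following properties: For $x, y, z \in V$ and $\alpha \in \mathbb{R}$,

(i) $\langle x, x \rangle \geq 0$ and $\langle x, x \rangle = 0$ if and only if $x = 0$,
(ii) $\langle x, y \rangle = \langle y, x \rangle$,
(iii) $\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle$ and $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$,
(iv) $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$.

$(V, \langle\ ,\ \rangle)$ is called an inner product space. For brevity's sake, we may say V is an inner product space without explicitly mentioning the inner product $\langle\ ,\ \rangle$.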


Example 5.1.1 The dot product defined above on R” is an inner product. 


Convention. Unless specified otherwise the inner product on R" will be 
assumed to be the dot product. 


Example 5.1.2 For $x, y \in \mathbb{R}^2$, where $x = (x_1, x_2)$, $y = (y_1, y_2)$,

\[ \langle x, y \rangle = y_1 (x_1 + 2x_2) + y_2 (2x_1 + 5x_2) \]

defines an inner product. We shall prove that if $\langle x, x \rangle = 0$ then $x = 0$. We have

\[
\begin{aligned}
\langle x, x \rangle &= x_1^2 + 4x_1 x_2 + 5x_2^2\\
&= x_1^2 + 4x_1 x_2 + 4x_2^2 + x_2^2\\
&= (x_1 + 2x_2)^2 + x_2^2.
\end{aligned}
\]

Thus $\langle x, x \rangle = 0$ if and only if $(x_1 + 2x_2)^2 = 0$ and $x_2^2 = 0$. This is true if and only if $x_1 = 0 = x_2$, that is, if and only if $x = 0$. The rest of the verifications are left as an easy exercise to the reader.
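
The completing-the-square step in Example 5.1.2 can be spot-checked on a grid of integer points. This is only a sanity check on finitely many vectors, not a substitute for the algebraic argument; the names `ip`, `completed_square` and `grid` are hypothetical.

```python
def ip(x, y):
    """<x, y> = y1 (x1 + 2 x2) + y2 (2 x1 + 5 x2), as in Example 5.1.2."""
    x1, x2 = x
    y1, y2 = y
    return y1 * (x1 + 2 * x2) + y2 * (2 * x1 + 5 * x2)


def completed_square(x):
    """The expression (x1 + 2 x2)^2 + x2^2 obtained by completing the square."""
    x1, x2 = x
    return (x1 + 2 * x2) ** 2 + x2 ** 2


# All integer points with coordinates between -3 and 3.
grid = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]

squares_agree = all(ip(x, x) == completed_square(x) for x in grid)
symmetric = all(ip(x, y) == ip(y, x) for x in grid for y in grid)
positive = all(ip(x, x) > 0 for x in grid if x != (0, 0))
```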


Exercise 5.1.1 Let $V = \mathbb{C}$. Show that $\langle z, w \rangle := \operatorname{Re}(z \overline{w})$ defines an inner product on $\mathbb{C}$.


Exercise 5.1.2 Let V = R? and define (x,y) := yi(221+22)+yo(2, 45 ) 
Show that this defines an inner product on R?. a 


Exercise 5.1.3 If $f, g \in C[0, 1]$ define $\langle f, g \rangle = \int_0^1 f(t) g(t)\, dt$. Note first of all that the integral exists (thanks to analysis!). Here also, the crucial thing to show is that $\langle f, f \rangle = 0$ if and only if $f = 0$. This follows from Exercise 5.1.4 from Analysis. The rest of the properties follow from well-known properties of the integral. Thus $(C[0, 1], \langle\ ,\ \rangle)$ is an inner product space.


Exercise 5.1.4 Let $f : [0, 1] \to \mathbb{R}$ be continuous with $f(t) \geq 0$ for $t \in [0, 1]$. Then $\int_0^1 f(t)\, dt = 0$ if and only if $f(t) = 0$ for all $t \in [0, 1]$. Hint: To prove the nontrivial part, assume that $\int_0^1 f = 0$. If f is not identically 0, let $t_0$ be such that $f(t_0) > 0$. Let $\alpha := f(t_0)$ and $\varepsilon := \alpha/2$. For this $\varepsilon$, by continuity of f at $t_0$, there is a $\delta > 0$ such that $f(t) \in (\alpha/2, 3\alpha/2)$ for $t \in (t_0 - \delta, t_0 + \delta)$. Using various properties of the integral, we see that

\[ \int_0^1 f(t)\, dt \geq \int_{t_0 - \delta}^{t_0 + \delta} f(t)\, dt \geq \int_{t_0 - \delta}^{t_0 + \delta} \frac{\alpha}{2}\, dt = \alpha\delta > 0. \]

This contradicts our assumption that $\int_0^1 f(t)\, dt = 0$.


Exercise 5.1.5 Let $V = M(n, \mathbb{R})$. Define $\langle A, B \rangle := \operatorname{tr} AB^t$. Show that this defines an inner product on $M(n, \mathbb{R})$ ($\operatorname{tr}(X) := \sum_i x_{ii}$ if $X = (x_{ij})$).


To gain facility with computations involving inner products, do the next exercise. Think of distribution of multiplication over addition.

Exercise 5.1.6 Let V be an inner product space. Then

\[ \langle ax + by, cv + dw \rangle = ac \langle x, v \rangle + ad \langle x, w \rangle + bc \langle y, v \rangle + bd \langle y, w \rangle. \]

What are $\langle x + y, x + y \rangle$, $\langle x + y, x - y \rangle$?

Exercise 5.1.7 Fix $a \in V$, an inner product space. Show that the maps from V to $\mathbb{R}$ given by $x \mapsto \langle x, a \rangle$ and $y \mapsto \langle a, y \rangle$ are linear.


Exercise 5.1.8 Let $\{v_1, \dots, v_n\}$ be a (not necessarily the standard) basis of $\mathbb{R}^n$. Let $\alpha_i \in \mathbb{R}$ be given for $1 \leq i \leq n$. Show that there exists a unique vector $x \in \mathbb{R}^n$ such that $x \cdot v_i = \alpha_i$ for all i. Hint: Think of a linear map!


Definition 5.1.5 Let V be an inner product space. We can imitate the definition of length or norm defined earlier. The length or norm of a vector $v \in V$ is $\|v\| := \langle v, v \rangle^{1/2}$, the positive square root of the non-negative number $\langle v, v \rangle$.


If the reader feels uncomfortable with abstract inner product space, 
he may assume that the inner product space is R” with the dot product 
introduced above. However, he should notice that nowhere (except in some 
examples) we shall have to use the way the inner product is defined. We 
shall use only the defining properties of the inner product. 


Lemma 5.1.1 Let V be an inner product space. The norm function $\|\ \| : V \to \mathbb{R}$ has the following properties:

(1) $\|x\| \geq 0$ and $\|x\| = 0$ if and only if $x = 0$, for $x \in V$.
(2) $\|ax\| = |a| \|x\|$, $x \in V$ and $a \in \mathbb{R}$.

Furthermore, given a nonzero vector $v \in V$, there is a vector $u \in V$ such that $\|u\| = 1$ and $v = \|v\| u$. This u is called the unit vector along v.

Proof (1) is easy. We shall prove (2). It suffices to show that $\|ax\|^2 = |a|^2 \|x\|^2$.

\[ \|ax\|^2 = \langle ax, ax \rangle = a \langle x, ax \rangle = a \langle ax, x \rangle = a^2 \langle x, x \rangle. \]

(Can you justify the steps above?) To prove the last assertion, observe that the equation $v = \|v\| u$ suggests that we take $u = \frac{v}{\|v\|}$. This u is of unit length because of (2):

\[ \|u\| = \Big\| \frac{1}{\|v\|} v \Big\| = \frac{1}{\|v\|} \|v\| = 1. \]

□

A vector u in an inner product space V is said to be of unit norm or unit length if $\|u\| = 1$, that is, if and only if $\langle u, u \rangle = 1$. The above construction of a unit vector along a given nonzero vector is used quite often in the sequel.


Theorem 5.1.2 (Cauchy-Schwarz Inequality) Let V be an inner product space. If $x, y \in V$, then $|\langle x, y \rangle| \leq \|x\| \|y\|$. Further, equality holds if and only if one is a multiple of the other (that is, x and y are linearly dependent).

Remark 5.1.1 In the case of $\mathbb{R}^2$ with the dot product, the Cauchy-Schwarz inequality is obvious. For, since $\langle x, y \rangle = \|x\| \|y\| \cos\theta$, we have

\[
\begin{aligned}
|\langle x, y \rangle| &= \big|\, \|x\| \|y\| \cos\theta \,\big|\\
&= \|x\| \|y\| |\cos\theta|\\
&\leq \|x\| \|y\|.
\end{aligned}
\]

Proof We shall give three proofs: the first is elementary and geometric, the second uses calculus and the third uses results on quadratic equations. Also we indicate a proof of the Cauchy-Schwarz inequality in the special case of $\mathbb{R}^n$ with the dot product. The last one is to convince you how the level of abstraction helps us understand the underlying principles.


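Proof 1. If $x = 0$ or $y = 0$, then $\langle x, y \rangle = 0$ and either $\langle x, x \rangle = 0$ or $\langle y, y \rangle = 0$. Hence the result. Now consider the case when $\|x\| = \|y\| = 1$. Consider $\langle x - y, x - y \rangle$. Then

\[
\begin{aligned}
0 \leq \langle x - y, x - y \rangle &= \langle x, x \rangle + \langle y, y \rangle - 2 \langle x, y \rangle\\
&= 2 - 2 \langle x, y \rangle \quad \text{as } \|x\| = \|y\| = 1.
\end{aligned}
\]

This gives $\langle x, y \rangle \leq 1$. Similarly $\langle x + y, x + y \rangle \geq 0$ yields $-\langle x, y \rangle \leq 1$. Hence

\[ |\langle x, y \rangle| \leq 1 = \|x\| \|y\|. \tag{5.1.1} \]

We now prove the statement concerning the equality. Let $|\langle x, y \rangle| = 1$. Then either $\langle x, y \rangle = 1$ or $-1$. If $\langle x, y \rangle = 1$, from the above chain of inequalities we deduce that $\langle x - y, x - y \rangle = 0$ or $x = y$. If $\langle x, y \rangle = -1$ we see that $x = -y$. Thus equality holds if and only if either $x + y = 0$ or $x - y = 0$, that is, if and only if $x = \pm y$.

Now suppose x and y are nonzero (not necessarily of unit length). Then $u = \frac{x}{\|x\|}$ and $v = \frac{y}{\|y\|}$ are of unit length (by Lemma 5.1.1). By the previous case $|\langle u, v \rangle| \leq 1$. Therefore,

\[ \Big| \Big\langle \frac{x}{\|x\|}, \frac{y}{\|y\|} \Big\rangle \Big| = \frac{1}{\|x\|} \frac{1}{\|y\|} |\langle x, y \rangle| \leq 1. \]

From this we get $|\langle x, y \rangle| \leq \|x\| \|y\|$.

If x and y are nonzero, then the equality means $\langle x, y \rangle = \|x\| \|y\|$ or $-\langle x, y \rangle = \|x\| \|y\|$. Assume the first happens. Then

\[
\begin{aligned}
\langle x, y \rangle = \|x\| \|y\| &\implies \Big\langle \frac{x}{\|x\|}, \frac{y}{\|y\|} \Big\rangle = 1\\
&\implies \frac{x}{\|x\|} = \frac{y}{\|y\|}\\
&\implies x = \frac{\|x\|}{\|y\|}\, y.
\end{aligned}
\]

The other case is similar. □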


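Proof 2. Fix x and y in V. If $y = 0$, then the result is obviously true. So we assume that $y \neq 0$. Consider the real valued function of the real variable $f(t) = \langle x + ty, x + ty \rangle$. We want to investigate the extremum points of f:

\[ f(t) = \langle x, x \rangle + 2t \langle x, y \rangle + t^2 \langle y, y \rangle. \]

Thus $f(t)$ is a polynomial in t with real coefficients.

\[ f'(t) = 2 \langle x, y \rangle + 2t \langle y, y \rangle. \]

Then $t_0$ is an extremum point for f only if $f'(t_0) = 0$, that is, only if $\langle x, y \rangle + t_0 \langle y, y \rangle = 0$. This suggests that we choose $t_0 = -\frac{\langle x, y \rangle}{\langle y, y \rangle}$. Is this point an extremum point? Now $f''(t) = 2 \langle y, y \rangle > 0$, since $y \neq 0$, for all t, in particular for $t = t_0$. Hence $f(t_0)$ is a minimum. That is, $f(t_0) \leq f(t)$ for all t. But $f(t_0) \geq 0$ since $f(t) \geq 0$ for all t. That is,

\[ \langle x, x \rangle - 2 \frac{\langle x, y \rangle^2}{\langle y, y \rangle} + \frac{\langle x, y \rangle^2}{\langle y, y \rangle} \geq 0. \]

It follows that

\[ \langle x, x \rangle \geq \frac{\langle x, y \rangle^2}{\langle y, y \rangle} \quad \text{or} \quad |\langle x, y \rangle| \leq \langle x, x \rangle^{1/2} \langle y, y \rangle^{1/2}. \]

The equality case is again dealt with by carefully retracing the above chain of inequalities. □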


Proof 3. Let $p(t) := at^2 + bt + c$ be a quadratic polynomial in t with real coefficients. Recall that $p(t)$ is always nonnegative (or always nonpositive) if and only if it has no real roots or a double real root. This happens if and only if the discriminant $b^2 - 4ac$ is negative or equal to 0. Now $f(t)$ as in the second proof is a quadratic polynomial in t with real coefficients $a = \langle y, y \rangle$, $b = 2 \langle x, y \rangle$ and $c = \langle x, x \rangle$. Also, $f(t)$ is always nonnegative. So we conclude that $b^2 - 4ac \leq 0$. From this the required result follows. □
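
The quadratic $f(t) = \langle x + ty, x + ty \rangle$ used in the second and third proofs can be examined concretely for the dot product on $\mathbb{R}^n$. The sketch below is an illustration with integer vectors and exact arithmetic; the function names are hypothetical, and `minimum_value` assumes $y \neq 0$.

```python
from fractions import Fraction


def dot(x, y):
    """Dot product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))


def quadratic_coefficients(x, y):
    """Coefficients (a, b, c) of f(t) = <x + t y, x + t y> = a t^2 + b t + c."""
    return dot(y, y), 2 * dot(x, y), dot(x, x)


def discriminant(x, y):
    """b^2 - 4ac; Cauchy-Schwarz says this is never positive."""
    a, b, c = quadratic_coefficients(x, y)
    return b * b - 4 * a * c


def minimum_value(x, y):
    """f(t0) at t0 = -<x, y>/<y, y>, assuming y is nonzero."""
    a, b, c = quadratic_coefficients(x, y)
    t0 = Fraction(-b, 2 * a)
    return a * t0 * t0 + b * t0 + c
```

For linearly dependent vectors the discriminant and the minimum value are both zero, which matches the equality case of the theorem.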


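Remark 5.1.2 Let us consider $\mathbb{R}^n$ with the dot product. The Cauchy-Schwarz inequality in this case reads as follows:

\[ \Big| \sum_{i=1}^n x_i y_i \Big| \leq \Big( \sum_{i=1}^n x_i^2 \Big)^{1/2} \Big( \sum_{i=1}^n y_i^2 \Big)^{1/2}, \quad \text{for all } x_i, y_i \in \mathbb{R}. \tag{5.1.2} \]

This concrete inequality is quite useful in analysis. Note that the inequality is a special case of what we proved above. Another more classical proof of this follows from Lagrange's identity

\[ \Big( \sum_{i=1}^n x_i y_i \Big)^2 = \Big( \sum_{i=1}^n x_i^2 \Big) \Big( \sum_{i=1}^n y_i^2 \Big) - \sum_{1 \leq i < j \leq n} (x_i y_j - x_j y_i)^2. \]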
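
Lagrange's identity is easy to confirm on a numerical example. The sketch below uses integer vectors so that all arithmetic is exact; `lagrange_defect` is a hypothetical name for the sum of squares that is subtracted on the right-hand side.

```python
from itertools import combinations


def dot(x, y):
    """Dot product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))


def lagrange_defect(x, y):
    """Sum over i < j of (x_i y_j - x_j y_i)^2."""
    return sum((x[i] * y[j] - x[j] * y[i]) ** 2
               for i, j in combinations(range(len(x)), 2))


x = (1, 2, 3)
y = (4, 5, 6)

lhs = dot(x, y) ** 2
rhs = dot(x, x) * dot(y, y) - lagrange_defect(x, y)
```

Since the subtracted term is a sum of squares, the identity gives Equation (5.1.2) at once, with equality exactly when every $x_i y_j - x_j y_i$ vanishes.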


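Exercise 5.1.9 Let $a_i$, $1 \leq i \leq n$, be positive reals. Let $\alpha \in \mathbb{R}$. Show that

\[ \Big( \sum_{i=1}^n a_i^{1/2} \Big)^2 \leq \Big( \sum_{i=1}^n a_i^{\alpha} \Big) \Big( \sum_{i=1}^n a_i^{1-\alpha} \Big) \]

with equality if and only if either $\alpha = 1/2$ or $\alpha \neq 1/2$ but all the $a_i$'s are equal. Hint: Let $x_i := a_i^{\alpha/2}$ and $y_i := a_i^{(1-\alpha)/2}$ in Equation (5.1.2).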


Exercise 5.1.10 What does the Cauchy-Schwarz inequality mean for V 
in Exercise 5.1.5? 


Corollary 5.1.3 Let V be an inner product space. The norm function $\|\ \| : V \to \mathbb{R}$ has the following properties:

(1) $\|x\| = 0$ if and only if $x = 0$, for $x \in V$.
(2) $\|ax\| = |a| \|x\|$ for $x \in V$ and $a \in \mathbb{R}$.
(3) $\|x + y\| \leq \|x\| + \|y\|$, for $x, y \in V$. This is known as the Triangle Inequality.
(4) $\big|\, \|x\| - \|y\| \,\big| \leq \|x - y\|$ for all $x, y \in V$.

Proof (1) and (2) were proved earlier. To prove the triangle inequality, we compute

\[
\begin{aligned}
\|x + y\|^2 &= \langle x + y, x + y \rangle\\
&= \langle x, x \rangle + 2 \langle x, y \rangle + \langle y, y \rangle\\
&\leq \|x\|^2 + 2 \|x\| \|y\| + \|y\|^2 \quad \text{by Cauchy-Schwarz}\\
&= (\|x\| + \|y\|)^2.
\end{aligned}
\]

The triangle inequality follows.

Observe $\|x\| = \|(x - y) + y\| \leq \|x - y\| + \|y\|$ by the triangle inequality. It follows that $\|x\| - \|y\| \leq \|x - y\|$. Interchanging x and y in this inequality, we get

\[ \|y\| - \|x\| \leq \|y - x\| = \|(-1)(x - y)\| = |-1| \|x - y\| = \|x - y\|. \]

Thus, $\pm(\|x\| - \|y\|) \leq \|x - y\|$. Hence the last assertion follows. □
Definition 5.1.6 A metric on a set X is a function $d : X \times X \to \mathbb{R}$ with the following properties:

(i) $d(x, y) \geq 0$ for $x, y \in X$ and $d(x, y) = 0$ if and only if $x = y$.
(ii) $d(x, y) = d(y, x)$ for all $x, y \in X$.
(iii) $d(x, z) \leq d(x, y) + d(y, z)$ for all $x, y, z \in X$ (Triangle inequality).

Proposition 5.1.4 Let V be an inner product space. If we define $d(x, y) := \|x - y\|$ for $x, y \in V$, then d is a metric on V.

Proof We shall show that d satisfies the triangle inequality. As is to be expected, we use the triangle inequality for the norm. Let $x, y, z \in V$. Then

\[
\begin{aligned}
d(x, z) &= \|x - z\|\\
&= \|(x - y) + (y - z)\|\\
&\leq \|x - y\| + \|y - z\| = d(x, y) + d(y, z).
\end{aligned}
\]

The rest of the proof is easy and left to the reader. □
Exercise 5.1.11 Show that "distance" is translation invariant. That is,

\[ d(x + z, y + z) = d(x, y) \quad \text{for all } x, y, z \in V. \]

In the notation of Exercise 4.1.2, this says:

\[ d(T_z(x), T_z(y)) = d(x, y) \quad \text{for all } x, y, z \in V. \]


5.2 Orthogonality


Unless specified otherwise, V will stand for an inner product space. 

In Section 5.1 we saw how the existence of an inner product on a vector space induces notions such as the length of a vector and the distance between two vectors. In this section, we shall see how to define the angle between two nonzero vectors of an inner product space.

The remarks made after Definition 5.1.1 motivate the following definition of the angle between two nonzero vectors x and y.


Definition 5.2.1 If x and y are two nonzero vectors in an inner product space V, then by the Cauchy-Schwarz inequality, we have

\[ -1 \leq \frac{\langle x, y \rangle}{\|x\| \|y\|} \leq 1. \]

From trigonometry (or more rigorously, from analysis), it follows that there exists a unique $\theta \in [0, \pi]$ such that $\cos\theta = \frac{\langle x, y \rangle}{\|x\| \|y\|}$. This $\theta$ is called the angle between the nonzero vectors x and y.
Exercise 5.2.1 Compute the angle between

(1) $v = e_1$ and $w = e_1 + e_2$ in $\mathbb{R}^2$,
(2) $v = (x, y)$ and $(-y, x)$, $x \neq 0 \neq y$, in $\mathbb{R}^2$, and
(3) $(x_1, \dots, x_{2n-1}, x_{2n})$ and $(-x_2, x_1, -x_4, x_3, \dots, -x_{2n}, x_{2n-1})$ in $\mathbb{R}^{2n}$.

and $g(t) = t^2$ in Exercise 5.1.3.

Exercise 5.2.3 Let V be as in Exercise 5.1.3. Let $f(t) = t$. Let $h(t) = t^2$. Compute $g = h - 3 \langle h, f \rangle f$. What is the angle between f and g? (If you are intrigued by this, the mystery behind this construction will be solved in a later section.)


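Definition 5.2.2 Let x and y be vectors in an inner product space V. We say x and y are orthogonal if $\langle x, y \rangle = 0$. This definition is meaningful, that is, symmetric in x and y, since then $\langle y, x \rangle = 0$ as well. Also, it coincides with what we have in $\mathbb{R}^2$ with the dot product: $\langle x, y \rangle = 0$ implies $\cos\theta = 0$ which implies $\theta = \frac{\pi}{2}$. We write $x \perp y$ if $\langle x, y \rangle = 0$.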


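Exercise 5.2.4 Let $x, y, z \in V$. Let $x \perp y$ and $x \perp z$. Then $x \perp (\alpha y + \beta z)$ for all $\alpha, \beta \in \mathbb{R}$. More generally, the set $\{v \in V \mid \langle x, v \rangle = 0\}$ is a vector subspace of V. It is denoted by $x^{\perp}$.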


Assume that $v = (\alpha, \beta) \neq 0$ in $\mathbb{R}^2$. What is the geometric description of $v^{\perp}$? $v^{\perp}$ is given by $\{(x, y) \in \mathbb{R}^2 \mid \alpha x + \beta y = 0\}$. Thus $v^{\perp}$ is a straight line passing through the origin perpendicular to the vector $(\alpha, \beta)$.

Proceeding similarly, we see that if $v = (\alpha, \beta, \gamma) \in \mathbb{R}^3$ is nonzero, then

\[ v^{\perp} = \{(x, y, z) \in \mathbb{R}^3 \mid \alpha x + \beta y + \gamma z = 0\}. \]

Thus $v^{\perp}$ is the plane through the origin with normal $v = (\alpha, \beta, \gamma)$.


Example 5.2.1 This generalizes the last observation. We wish to find the dimension of $W = \{v \in V \mid \langle v, a \rangle = 0\}$ for a fixed nonzero $a \in V$. Define $T_a : V \to \mathbb{R}$ by $T_a(v) = \langle v, a \rangle$. Then $T_a$ is a linear map (Exercise 5.1.7) and $W = \ker(T_a)$. Since $a \neq 0$, $\operatorname{Im}(T_a) = \mathbb{R}$. For, $T_a(a) = \langle a, a \rangle \neq 0$ lies in the vector subspace $\operatorname{Im}(T_a) \subset \mathbb{R}$. Therefore $\dim \operatorname{Im}(T_a) \geq 1$ and hence equals 1. Now, by Equation (4.3.1), we have

\[ \dim V = \dim \ker(T_a) + \dim \operatorname{Im}(T_a) \]

which implies that $\dim W = \dim V - 1$. If $a = 0$, then $W = V$.


Exercise 5.2.5 Let v and w be nonzero vectors in V with $v \perp w$. Let $\alpha, \beta \in \mathbb{R}$ be such that $\alpha v + \beta w = 0$. Then $\alpha = 0 = \beta$. That is, two nonzero vectors orthogonal to each other are linearly independent.

Exercise 5.2.6 If a vector $x \in V$ is orthogonal to all the vectors in V, then $x = 0$. Consequently, if $\langle x, v \rangle = \langle y, v \rangle$ for all $v \in V$, then $x = y$.


This simple exercise is quite often used to show the equality of two 
vectors in an inner product space. 


Exercise 5.2.7 Let $W_i$, $i = 1, 2$, be vector subspaces of $V$. Assume that each vector in one of them is orthogonal to all of the other. Show that $W_1 \cap W_2 = \{0\}$.


Exercise 5.2.8 Let $S$ be any nonempty subset of $V$. Let

$$S^\perp := \{v \in V \mid \langle v, s \rangle = 0 \text{ for all } s \in S\}.$$

Then $S^\perp$ is a vector subspace. Can you think of two proofs, one direct and the other using Exercise 2.2.27?


Exercise 5.2.9 Let $v = (\alpha, \beta) \in \mathbb{R}^2$ be nonzero. Describe $v^\perp$ as $\mathbb{R}w$ for a suitable $w$.


Exercise 5.2.10 Let $v$ and $w$ be two nonzero vectors in $\mathbb{R}^3$. Assume that the set of vectors orthogonal to both of them is a plane (through the origin). Then each is a scalar multiple of the other. (Do you see this geometrically?)




Exercise 5.2.11 Let $v = (\alpha, \beta, \gamma)$ be a nonzero vector in $\mathbb{R}^3$. Find a basis of $W := v^\perp$. Hint: $W$ is described by a linear equation and you have learnt Section 1.1!


Exercise 5.2.12 This is a continuation of Exercise 5.2.11. Give a pair of 
equations whose solution set is the line joining the origin and v. 


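Lemma 5.2.1 (Pythagoras Theorem) Let $x, y \in V$ be orthogonal to each other. Then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.

More generally, let $\{v_i\}_{i=1}^k$ be a set of vectors such that they are pairwise orthogonal, that is, $v_i \perp v_j$ if $i \neq j$. Then

$$\Bigl\| \sum_{i=1}^k v_i \Bigr\|^2 = \sum_{i=1}^k \|v_i\|^2.$$

Proof Do you see why this is called the Pythagoras theorem? A simple computation yields the result:

$$\begin{aligned}
\Bigl\| \sum_{i=1}^k v_i \Bigr\|^2 &= \Bigl\langle \sum_{i=1}^k v_i, \sum_{j=1}^k v_j \Bigr\rangle \\
&= \sum_{1 \leq i, j \leq k} \langle v_i, v_j \rangle \\
&= \sum_{i=1}^k \langle v_i, v_i \rangle \quad \text{as } \langle v_i, v_j \rangle = 0 \text{ for } i \neq j \\
&= \sum_{i=1}^k \|v_i\|^2.
\end{aligned}$$

Did you notice how the sums were indexed at the top right?

□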


Exercise 5.2.13 $\|x + y\| = \|x\| + \|y\|$ if and only if one is a nonnegative scalar multiple of the other.


Exercise 5.2.14 For any $x, y \in V$, we have

$$4 \langle x, y \rangle = \|x + y\|^2 - \|x - y\|^2.$$

This is known as the polarization identity.


Exercise 5.2.15 Prove that for any two vectors $x, y \in V$,

$$\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2).$$

Geometrically this means that the sum of the squares of the diagonals equals the sum of the squares of the sides of a parallelogram.
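Before proving the identities of Exercises 5.2.14 and 5.2.15, it can help to test them on a concrete pair of vectors. The following is a minimal sketch, assuming the dot product on $\mathbb{R}^3$; the helper names `inner`, `norm_sq`, `add`, `sub` and the sample vectors are illustrative choices and do not come from the text. A numerical check is, of course, not a proof.

```python
def inner(x, y):
    """Dot product of two vectors given as lists of numbers."""
    return sum(a * b for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm ||x||^2 = <x, x>."""
    return inner(x, x)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

# Sample vectors (hypothetical data, integer entries keep arithmetic exact).
x = [3, -1, 2]
y = [1, 4, -2]

# Polarization identity: 4<x, y> = ||x + y||^2 - ||x - y||^2
polarization_holds = 4 * inner(x, y) == norm_sq(add(x, y)) - norm_sq(sub(x, y))

# Parallelogram law: ||x + y||^2 + ||x - y||^2 = 2(||x||^2 + ||y||^2)
parallelogram_holds = (
    norm_sq(add(x, y)) + norm_sq(sub(x, y)) == 2 * (norm_sq(x) + norm_sq(y))
)

print(polarization_holds, parallelogram_holds)
```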




Exercise 5.2.16 Prove that $\|x\| = \|y\|$ if and only if $x - y \perp x + y$. (The geometric meaning of this is that a parallelogram is a rhombus if and only if the diagonals are perpendicular.)


Exercise 5.2.17 Prove that $x$ and $y$ are orthogonal if and only if

$$\|x - y\|^2 = \|x\|^2 + \|y\|^2.$$

(This is Pythagoras theorem and its converse.)


5.3 Some Geometric Applications 


We mentioned at the beginning of this chapter that $\mathbb{R}^2$ with the dot product is a model for Euclidean geometry. In this section, we give indications for this assertion by proving some results from geometry using the notions developed so far. This also serves to instill a geometric way of looking at linear algebra. To start with, let us examine the last few exercises at the end of Section 5.2.


Example 5.3.1 The sum of the squares of the diagonals of a parallelogram 
equals the sum of the squares of its sides. 


To prove this we need to turn this geometric problem into the language of linear algebra. Without loss of generality, let us assume that three of the vertices are the vectors $0$, $x$ and $y$. Draw a picture. Then the fourth vertex is $x + y$. (Recall the geometric interpretation of vector addition.) The length of the side whose end points are $0$ and $x$ (respectively $y$) is $\|x\|$ (respectively $\|y\|$). So we know the lengths of the sides. One of the diagonals has endpoints $0$ and $x + y$ and so its length is $\|x + y\|$. The other diagonal has endpoints $x$ and $y$ and so its length is $\|x - y\|$. Therefore, what we are asked to show is that

$$2(\|x\|^2 + \|y\|^2) = \|x + y\|^2 + \|x - y\|^2.$$

This is easy: We write $\|x + y\|^2 = \langle x + y, x + y \rangle$ and expand the right side. We do similarly for $\|x - y\|^2$. Add the results to get what we want.

$$\begin{aligned}
\langle x + y, x + y \rangle &= \langle x, x \rangle + 2 \langle x, y \rangle + \langle y, y \rangle, \\
\langle x - y, x - y \rangle &= \langle x, x \rangle - 2 \langle x, y \rangle + \langle y, y \rangle.
\end{aligned}$$

The sum of the right sides in these equations is $2(\langle x, x \rangle + \langle y, y \rangle)$.


We shall be brief in the rest of this section. 




Example 5.3.2 A parallelogram is a rhombus if and only if the diagonals 
are perpendicular to each other. 


We use the notation as in Example 5.3.1. The direction vector of the diagonal joining $0$ and $x + y$ may be taken as $x + y$. The direction vector of the diagonal joining the points $x$ and $y$ may be taken as $x - y$. (Recall that direction vectors of the line $\ell(x, y)$ are of the form $t(x - y)$ for $t \in \mathbb{R}$.) The diagonals are perpendicular if and only if their direction vectors are orthogonal, that is, if and only if $\langle x - y, x + y \rangle = 0$. Thus, we are asked to show that

$$\|x\| = \|y\| \text{ if and only if } \langle x + y, x - y \rangle = 0.$$


We invite the reader to verify this. 


Example 5.3.3 A parallelogram is a rectangle if and only if the diagonals 
are of equal length. 


With the notation as above, what we are supposed to show is that $\|x + y\| = \|x - y\|$ if and only if the angle between the sides $0x$ (the line segment joining $0$ and $x$) and $0y$ (the line segment joining $0$ and $y$) is $\pi/2$, that is, if and only if the cosine of this angle is zero. This is translated in our language as $\|x + y\| = \|x - y\|$ if and only if $\langle x, y \rangle = 0$. In an inner product space, it is easier to work with the inner product than the norm. So what we would like to establish is

$$\langle x + y, x + y \rangle = \langle x - y, x - y \rangle \text{ if and only if } \langle x, y \rangle = 0.$$


Now this is an easy exercise for the reader. 


We now turn our attention to the study of triangles. The first example 
is the Pythagoras theorem. 


Example 5.3.4 A triangle is right angled if and only if there exists one 
side whose square equals the sum of the squares of the other two sides. 


How do we find a model for this in linear algebra? Recall that a triangle is a triple of non-collinear points, that is, they do not lie on a line. These three points are considered as the vertices of the triangle. To find a model for this, we may assume that the vertices are at $0$, $x$ and $y$. The condition for their non-collinearity turns out to be their linear independence. Do you see this? The line joining $0$ and $x$ (respectively $y$) is $\mathbb{R}x$ (respectively $\mathbb{R}y$). The point $x$ (respectively $y$) lies on $\mathbb{R}y$ (respectively $\mathbb{R}x$) if and only if there are $a, b \in \mathbb{R}$, not both zero, such that $ax = by$, that is, if and only if $x$ and $y$ are linearly dependent. Thus a triangle with a vertex at the origin corresponds to a triple $(0, x, y)$ of points with $\{x, y\}$ linearly independent. The lengths of the sides of the triangle are, therefore, $\|x\|$, $\|y\|$ and $\|x - y\|$. (Refer to Example 5.3.1.)

To simplify matters, we make a further assumption. If the given triangle has the Pythagorean property $c^2 = a^2 + b^2$, we may assume that the vertex opposite the longest side $c$ is at $0$. Thus what we are supposed to prove is $\|x - y\|^2 = \|x\|^2 + \|y\|^2$ if and only if $\langle x, y \rangle = 0$. As usual, the reader proves this.


Example 5.3.5 If a triangle is isosceles, then the medians to the two sides of equal length are of equal length.


We keep the notation of Example 5.3.4. We assume that the sides $0x$ and $0y$ are of equal length. This means that $\|x\| = \|y\|$. The midpoints of these sides are $\frac{1}{2}x$ and $\frac{1}{2}y$. The medians under consideration are the line segments joining (i) $x$ and $y/2$ and (ii) $y$ and $x/2$. Their lengths are $\|x - y/2\|$ and $\|y - x/2\|$. What we have to prove is: $\|x\| = \|y\|$ implies $\|x - y/2\| = \|y - x/2\|$. We ask the reader to check this. (Remember: It is always better to use the inner product than the norm.)


Exercise 5.3.1 Prove the converse of Example 5.3.5. 


We now look at a result which belongs to Affine Geometry, that part of geometry which deals with points, lines, planes and their incidence (inclusion) relations and which does not deal with metric concepts such as lengths and angles. First a definition: Given a line segment

$$[x, y] = \{tx + (1 - t)y \mid 0 \leq t \leq 1\}$$

in a vector space, the point $z := (x + y)/2$ is called the midpoint of the line segment. (If $V$ happens to be an inner product space, then this coincides with our metric requirement: $z$ divides the line segment into two parts of equal length. The point here is that we can define the midpoint of a line segment in any vector space.) The result that we want to prove is taken up in Example 5.3.6.


Example 5.3.6 The medians of a triangle are concurrent. 


As earlier, we take the vertices at $0$, $x$ and $y$ with $\{x, y\}$ linearly independent. The midpoints of the sides are $(x + y)/2$, $x/2$ and $y/2$. The lines joining the vertices with the midpoints of the opposite sides are: $r(x + y)/2$, $sx + (1 - s)(y/2)$ and $ty + (1 - t)(x/2)$, where $r, s, t \in \mathbb{R}$. Let us find the point of intersection of the last two lines. Finding the point of intersection is equivalent to solving for $s$ and $t$ in the equation $sx + (1 - s)(y/2) = ty + (1 - t)(x/2)$. This is rewritten as

$$\Bigl(s - \frac{1 - t}{2}\Bigr) x + \Bigl(\frac{1 - s}{2} - t\Bigr) y = 0.$$

Since $x$ and $y$ are linearly independent, we deduce that $s = \frac{1 - t}{2}$ and $t = \frac{1 - s}{2}$. It follows that $s = t = 1/3$. Thus the point of intersection is $(x + y)/3$. This point certainly lies on the first line (take $r = 2/3$). Thus the lines are concurrent at $(x + y)/3$.
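The computation in Example 5.3.6 can be replayed with exact arithmetic. The sketch below is only an illustration under stated assumptions: the helper `solve_2x2` (Cramer's rule) and the sample vertices `x`, `y` are hypothetical choices, not part of the text. It solves the two scalar equations $2s + t = 1$ and $s + 2t = 1$ obtained from linear independence, then confirms that both medians pass through $(x + y)/3$.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*s + a12*t = b1, a21*s + a22*t = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    s = Fraction(b1 * a22 - a12 * b2, det)
    t = Fraction(a11 * b2 - b1 * a21, det)
    return s, t

# s = (1 - t)/2 and t = (1 - s)/2, rewritten as 2s + t = 1 and s + 2t = 1.
s, t = solve_2x2(2, 1, 1, 1, 2, 1)

# Sample triangle with vertices 0, x, y (hypothetical data).
x = (Fraction(6), Fraction(0))
y = (Fraction(0), Fraction(3))

# Point s*x + (1 - s)*(y/2) on the median through x.
on_second = tuple(s * a + (1 - s) * b / 2 for a, b in zip(x, y))
# Point t*y + (1 - t)*(x/2) on the median through y.
on_third = tuple(t * b + (1 - t) * a / 2 for a, b in zip(x, y))
# The claimed common point (x + y)/3.
centroid = tuple((a + b) / 3 for a, b in zip(x, y))

print(s, t, on_second == on_third == centroid)
```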


Exercise 5.3.2 Redo Example 5.3.6 without assuming that one of the vertices is at the origin $0$. This will give you more symmetric expressions.


The next result that we want to prove is about circles: 


Example 5.3.7 The angle inscribed in a semicircle is a right angle.


We assume that the circle has centre at $0$ and radius $r$ in $\mathbb{R}^2$. The diametrically opposite points are given by $x$ and $-x$ for some $x$ with $\|x\| = r$. Let $y$ be any point on the circle. Then $\|y\| = r$. The angle inscribed is the angle between the lines $\ell(-x, y)$ and $\ell(x, y)$ at $y$. Their direction vectors are $x + y$ and $x - y$. Thus, we are expected to show that $\|x\| = \|y\|$ if and only if $\langle x + y, x - y \rangle = 0$. You should have no difficulty in proving this result.


In all the above computations, you might have noticed that whenever we needed to deal with two vectors, we assumed that we were in the two-dimensional space spanned by them. This assumption allows us to see what happens in $\mathbb{R}^2$ and get geometric insight. This point is worth remembering.

With this we end our excursion into geometry. 


5.4 Orthogonal Projection onto a Line 


This forms the heart of the next few sections. 

Let $v \in \mathbb{R}^n$ be any vector. Let $u$ be any unit vector, that is, $\|u\| = 1$. We want to find whether there is a vector $u_0$ in the set $\mathbb{R}u$ which is nearest to $v$. That is, we are looking for a real number $t_0$ such that $d(v, t_0 u) \leq d(v, tu)$ for all $t \in \mathbb{R}$. We can solve this problem in two ways. One is geometric and the other is analytic in the sense that it uses one variable calculus.

Look at Figure 5.4.1. Let $\theta$ be the angle between $u$ and $v$. The vector $P_u(v)$ is called the orthogonal projection of $v$ on $u$. From Figure 5.4.1, we see that

$$P_u(v) = \|v\| \cos\theta \, u = \|v\| \frac{\langle u, v \rangle}{\|u\| \, \|v\|} \, u = \langle u, v \rangle u$$

since $\|u\| = 1$. Hence we have $P_u(v) = \langle u, v \rangle u$. This suggests the following definition.




Figure 5.4.1 Orthogonal projection. 


Definition 5.4.1 Let $V$ be an inner product space. Let $u$ be a unit vector and $v \in V$ arbitrary. Then the projection $P_u(v)$ of $v$ onto the line (one-dimensional subspace) $\mathbb{R}u$ is defined by $P_u(v) := \langle u, v \rangle u$.


From Figure 5.4.1 it is clear that $P_u(v)$ is the point on $\mathbb{R}u$ closest to $v$. The following proposition asserts this.


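Proposition 5.4.1 For a unit vector $u$ and any $v \in V$, we let $P_u(v) := \langle v, u \rangle u$. Then $d(P_u(v), v) \leq d(au, v)$ for any $a \in \mathbb{R}$.

Proof First observe that $(v - P_u(v)) \perp u$:

$$\langle v - P_u(v), u \rangle = \langle v, u \rangle - \langle \langle v, u \rangle u, u \rangle = \langle v, u \rangle - \langle v, u \rangle \langle u, u \rangle = \langle v, u \rangle - \langle v, u \rangle = 0,$$

since $\langle u, u \rangle = 1$. Hence $(v - P_u(v)) \perp au$ for all $a \in \mathbb{R}$ and therefore

$$(v - P_u(v)) \perp (P_u(v) - au) \text{ for all } a \in \mathbb{R}.$$

Use Lemma 5.2.1 to get

$$\|v - au\|^2 = \|v - P_u(v) + P_u(v) - au\|^2 = \|v - P_u(v)\|^2 + \|P_u(v) - au\|^2.$$

Thus, we get $d(v, au)^2 \geq d(v, P_u(v))^2$ and equality holds if and only if $au = P_u(v)$.

□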
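Proposition 5.4.1 can be illustrated on a small example. The following is a minimal sketch, assuming the dot product on $\mathbb{R}^2$ and a unit vector with rational entries so that the arithmetic stays exact; the function names `inner`, `project_onto_line`, `dist_sq` and the sample data are hypothetical. It checks that $v - P_u(v)$ is orthogonal to $u$ and that no sampled multiple $au$ is closer to $v$ than $P_u(v)$.

```python
from fractions import Fraction

def inner(x, y):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))

def project_onto_line(v, u):
    """Return P_u(v) = <v, u> u, assuming u is a unit vector."""
    c = inner(v, u)
    return [c * a for a in u]

def dist_sq(x, y):
    """Squared distance d(x, y)^2 = ||x - y||^2."""
    d = [a - b for a, b in zip(x, y)]
    return inner(d, d)

u = [Fraction(3, 5), Fraction(4, 5)]  # unit vector: (3/5)^2 + (4/5)^2 = 1
v = [Fraction(2), Fraction(1)]

p = project_onto_line(v, u)
residual = [a - b for a, b in zip(v, p)]

# v - P_u(v) is orthogonal to u.
orthogonal = inner(residual, u) == 0

# Compare against the multiples a*u for a = -5, -4.75, ..., 5.
closest = all(
    dist_sq(v, p) <= dist_sq(v, [Fraction(k, 4) * a for a in u])
    for k in range(-20, 21)
)

print(p, orthogonal, closest)
```

Working with squared distances avoids square roots, in the same spirit as the remark that inner products are easier to handle than norms.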

This is how proofs are written. Do you see why we thought of this proof? Usually, we start with the geometric idea and turn it into a rigorous proof. Here also that is what happened. A close look at Figure 5.4.2 tells us that we have a right angled triangle whose "vertices" are at $au$, $P_u(v)$ and $v$.




Figure 5.4.2 Projection is the closest approximation. 


Naturally, the hypotenuse will be longer. For once, we tried to prove the 
result in a formal way without this geometric motivation. Hopefully, you 
understand the proof much better now and appreciate our efforts to put 
things in a geometric language! 

Now we prove this result using calculus. The first impulse would be to 
consider the function g(t) := ||v — tu|| and try to find its extreme values. A 
good analyst would not do this. For, the norm function is quite akin to the 
modulus function | | on R which is not differentiable at the origin. Also, 
inner products are easier to deal with than norms. If you doubt me, go 
through the proof below. 

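We consider the function $f(t) = \langle v - tu, v - tu \rangle$. A point $t_0$ is a minimum of $f$ if and only if it is a minimum of $g$. But $f$, as earlier, is a quadratic polynomial in $t$: $f(t) = \langle u, u \rangle t^2 - 2 \langle u, v \rangle t + \langle v, v \rangle$. If $t_0$ is an extremum point, then $f'(t_0) = 0$. We find that

$$f'(t) = 2t \langle u, u \rangle - 2 \langle u, v \rangle.$$

So $t_0 = \frac{\langle u, v \rangle}{\langle u, u \rangle}$. Also, $f''(t) = 2 \langle u, u \rangle > 0$ if $u \neq 0$. Thus $f$ attains a minimum at $t_0$. Therefore, the vector $\frac{\langle u, v \rangle}{\langle u, u \rangle} u$ is closest to $v$ among all the vectors $tu$ of $\mathbb{R}u$.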
This motivates the following definition: 


Definition 5.4.2 Let $u$ be a unit vector and let $v$ be an arbitrary vector. We define the orthogonal projection of $v$ along $u$ by $P_u(v) = \langle u, v \rangle u$. If $u$ is any nonzero vector then

$$P_u(v) = \Bigl\langle v, \frac{u}{\|u\|} \Bigr\rangle \frac{u}{\|u\|} = \frac{\langle v, u \rangle}{\langle u, u \rangle} \, u.$$


Exercise 5.4.1 Show that $v - P_u(v) \perp u$, for any $u, v \in V$ with $u \neq 0$. (This is already seen and inserted here for future reference.)


5.5 Orthonormal Basis 


Given any finite dimensional vector space we know that there exists a basis. But if the finite dimensional vector space has additional structure such as an inner product then we may look for a basis $\{e_1, \ldots, e_n\}$ which has some additional properties involving the inner product. For instance, the standard basis $\{e_i\}$ of $\mathbb{R}^n$ has the following properties involving the dot product:


(1) Each of them is of unit norm.


(2) They are all mutually orthogonal to each other.


These properties can be succinctly put as a single condition with the use of the Kronecker delta $\delta_{ij}$: $\delta_{ij}$ is defined as

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$

Thus the standard basis $\{e_i\}$ of $\mathbb{R}^n$ has the property that $\langle e_i, e_j \rangle = \delta_{ij}$, for $1 \leq i, j \leq n$. This suggests the following definition:


Definition 5.5.1 A basis $\{v_1, \ldots, v_n\}$ of $V$ is said to be orthonormal if we have $\langle v_i, v_j \rangle = \delta_{ij}$ for $1 \leq i, j \leq n$.


To emphasize the geometric aspect of this definition, let us rephrase this. A basis $\{v_i\}$ of $V$ is orthonormal if and only if each $v_i$ is of unit length and they are mutually orthogonal to each other.


Example 5.5.1 The standard basis of $\mathbb{R}^n$ is an orthonormal basis.


Example 5.5.2 The basis $\{(e_1 + e_2)/\sqrt{2}, (e_1 - e_2)/\sqrt{2}\}$ is an orthonormal basis of $\mathbb{R}^2$. Can you now construct an orthonormal basis of $\mathbb{R}^n$ which is not the standard basis?


Exercise 5.5.1 Is the set $\{E_{ij}\}$ of $M(n, \mathbb{R})$ an orthonormal basis (see Exercise 5.1.5)?




What is the use of an orthonormal basis? Let $\{v_i\}$ be a basis of $V$. Let $v \in V$ be given. By the very definition of basis, we know that there exist scalars $a_i \in \mathbb{R}$ such that $v = \sum_i a_i v_i$. In the case of an arbitrary basis, we have no clue to these scalars. But in the case of an orthonormal basis we know what they are! Let us assume that $\{v_i\}$ is an orthonormal basis of $V$. Write $v = \sum_i a_i v_i$. Let us take the inner product of both sides with the vector $v_j$. Using the orthonormal properties of the orthonormal basis, we get

$$\langle v, v_j \rangle = \Bigl\langle \sum_i a_i v_i, v_j \Bigr\rangle = \sum_i a_i \langle v_i, v_j \rangle = \sum_i a_i \delta_{ij} = a_j.$$

Thus $a_j$ is $\langle v, v_j \rangle$, a quantity which involves the given vector $v$, the $j$th basis vector and the inner product. Even though this is simple, to emphasize its practical importance, we elevate it to the status of a theorem.


Theorem 5.5.1 Let $\{v_i\}_{i=1}^n$ be an orthonormal basis of an inner product space $V$. If $v = \sum_{i=1}^n a_i v_i$, then the $i$th coefficient $a_i = \langle v, v_i \rangle$.
□
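Theorem 5.5.1 says the coefficients can be read off by taking inner products. Here is a minimal sketch, assuming the dot product on $\mathbb{R}^2$ and an orthonormal basis with rational entries (chosen here so that the check is exact; it is not a basis used in the text). The names `inner`, `basis`, `coeffs` and `rebuilt` are illustrative.

```python
from fractions import Fraction

def inner(x, y):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))

# An orthonormal basis of R^2 with rational entries (hypothetical example).
basis = [
    [Fraction(3, 5), Fraction(4, 5)],
    [Fraction(-4, 5), Fraction(3, 5)],
]

# Check <v_i, v_j> = delta_ij.
is_orthonormal = all(
    inner(basis[i], basis[j]) == (1 if i == j else 0)
    for i in range(2)
    for j in range(2)
)

v = [Fraction(1), Fraction(2)]

# a_i = <v, v_i>
coeffs = [inner(v, b) for b in basis]

# Rebuild v as the sum of a_i v_i and compare with the original vector.
rebuilt = [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(len(v))]

print(is_orthonormal, coeffs, rebuilt == v)
```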


Exercise 5.5.2 Let $\{v_i\}$ be an orthonormal basis. Let $x = \sum_i x_i v_i$ and $y = \sum_i y_i v_i$. Show the following:


(1) $\|x\|^2 = \sum_i \langle x, v_i \rangle^2$, (2) $\langle x, y \rangle = \sum_i x_i y_i$.


Exercise 5.5.3 This is the converse of Exercise 5.5.2. Let $\{v_i\}$ be a basis of $V$ such that if $v = \sum_i a_i v_i$ then $\|v\|^2 = \sum_i a_i^2$. Prove that $\{v_i\}$ is an orthonormal basis.


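Another use of an orthonormal basis is the following: Assume $V$ and $W$ are inner product spaces with orthonormal bases $\{v_j\}_{j=1}^m$ and $\{w_i\}_{i=1}^n$. Let $A : V \to W$ be any linear transformation. Then the general theory tells us that $Av_j = \sum_{i=1}^n a_{ij} w_i$ for some scalars $a_{ij} \in \mathbb{R}$. Since $\{w_i\}_{i=1}^n$ is an orthonormal basis of $W$, we can find $a_{ij}$ explicitly as earlier:

$$\langle Av_j, w_k \rangle = \Bigl\langle \sum_i a_{ij} w_i, w_k \Bigr\rangle = \sum_i a_{ij} \langle w_i, w_k \rangle = \sum_i a_{ij} \delta_{ik} = a_{kj}.$$

Thus with respect to the bases $\{v_j\}_{j=1}^m$ and $\{w_i\}_{i=1}^n$ the matrix for $A$ is

$$M(A) = \begin{pmatrix} \langle Av_1, w_1 \rangle & \cdots & \langle Av_m, w_1 \rangle \\ \vdots & & \vdots \\ \langle Av_1, w_n \rangle & \cdots & \langle Av_m, w_n \rangle \end{pmatrix}. \qquad (5.5.2)$$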



Perhaps one should record this as another theorem because of its 
usefulness but we resist the temptation. 


Does every inner product space have an orthonormal basis? Before we 
answer the question, we ask for a basis with a less stringent condition. 


Definition 5.5.2 A set $E \subset V$ is said to be orthogonal if
(i) $0 \notin E$, and
(ii) $\langle x, y \rangle = 0$ for all $x, y \in E$, $x \neq y$.


An orthogonal basis is a basis which is also an orthogonal set. 


Exercise 5.5.4 The following sets are orthogonal:
(1) Any orthonormal basis in $V$,
(2) $\{x + y, x - y\}$ in $\mathbb{R}^2$ with $\langle x, y \rangle \neq 0$, and
(3) $\{(a, 0, 0, 0), (0, b, 0, 0), (0, 0, c, 0)\}$ in $\mathbb{R}^4$ with $abc \neq 0$.
Which of these are orthogonal bases?

The next lemma generalizes Exercise 5.2.5.


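Lemma 5.5.2 Any orthogonal set in an inner product space $V$ is linearly independent.


Proof Let $E = \{v_1, \ldots, v_k\}$ be an orthogonal set (of nonzero vectors). Assume that $\sum_{i=1}^k a_i v_i = 0$ for $a_i \in \mathbb{R}$. Then $\bigl\langle \sum_{i=1}^k a_i v_i, v_j \bigr\rangle = 0$ for $1 \leq j \leq k$ and hence $\sum_{i=1}^k a_i \langle v_i, v_j \rangle = 0$. Since $\langle v_i, v_j \rangle = 0$ for $i \neq j$, the only surviving term is $a_j \langle v_j, v_j \rangle$. Since $v_j \in E$ and $0 \notin E$, $\langle v_j, v_j \rangle \neq 0$. This implies $a_j = 0$. As $j$ was arbitrary, we see that $a_j = 0$ for all $j$.

□

Exercise 5.5.5 Suppose $\{v_1, \ldots, v_n\}$ is an orthogonal basis. Then

$$\Bigl\{ e_i = \frac{v_i}{\|v_i\|} \Bigm| 1 \leq i \leq n \Bigr\}$$

is an orthonormal basis.

Exercise 5.5.6 If $\{v_1, \ldots, v_n\}$ is an orthogonal basis of $V$ and $v = \sum_{i=1}^n a_i v_i$, then $a_i = \frac{\langle v, v_i \rangle}{\langle v_i, v_i \rangle}$, $1 \leq i \leq n$.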




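We now show that $(\mathbb{R}^2, \langle \, , \rangle)$ has an orthonormal basis. By Exercise 5.5.5, it is enough to produce an orthogonal basis. Let $\{v_1, v_2\}$ be a basis of $\mathbb{R}^2$. Let $u_1 := v_1$. The idea is we want a nonzero vector $u_2$ which is orthogonal to $u_1$. Our earlier study of orthogonal projection of a vector onto a nonzero vector suggests a candidate, namely

$$u_2 := v_2 - \frac{\langle u_1, v_2 \rangle}{\langle u_1, u_1 \rangle} u_1 = v_2 - P_{u_1}(v_2).$$

(See Exercise 5.4.1.) Let $u_2 = v_2 - P_{u_1}(v_2)$. For completeness sake, we shall show that $u_1 \perp u_2$:

$$\begin{aligned}
\langle u_1, u_2 \rangle &= \langle u_1, v_2 - P_{u_1}(v_2) \rangle \\
&= \langle u_1, v_2 \rangle - \langle u_1, P_{u_1}(v_2) \rangle \\
&= \langle u_1, v_2 \rangle - \Bigl\langle u_1, \frac{\langle u_1, v_2 \rangle}{\langle u_1, u_1 \rangle} u_1 \Bigr\rangle \\
&= \langle u_1, v_2 \rangle - \frac{\langle u_1, v_2 \rangle}{\langle u_1, u_1 \rangle} \langle u_1, u_1 \rangle \\
&= 0.
\end{aligned}$$

Further, $u_2 \neq 0$. For, if $u_2 = 0$, then $v_2 = P_{u_1}(v_2)$ is a scalar multiple of $v_1$. But this contradicts the assumption that $\{v_1, v_2\}$ is a basis. Hence $\{u_1, u_2\}$ is an orthogonal basis.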
Imitating this argument, we prove the following theorem:

Theorem 5.5.3 Let $V$ be any inner product space. Then $V$ has an orthonormal basis.


Proof By Exercise 5.5.5, it is enough to produce an orthogonal basis of $V$.

Let $\{v_i\}_{i=1}^n$ be a basis of $V$. Let $u_1 = v_1$. As in the case of $\mathbb{R}^2$ above, we define

$$u_2 = v_2 - \frac{\langle v_2, v_1 \rangle}{\langle v_1, v_1 \rangle} v_1.$$

Then $u_2 \perp u_1$ (that is, $u_1 \perp v_2 - P_{u_1}(v_2)$). Therefore $\langle u_1, u_2 \rangle = 0$. Also $u_2 \neq 0$. For, if $u_2 = 0$, then $v_2 = \frac{\langle v_2, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1$ will imply that $\{v_1, v_2\}$ and hence the basis $\{v_i\}_{i=1}^n$ is linearly dependent, a contradiction. Let

$$u_3 = v_3 - \frac{\langle v_3, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 - \frac{\langle v_3, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2.$$

Then $\langle u_3, u_1 \rangle = 0$ and $\langle u_3, u_2 \rangle = 0$ and $u_3 \neq 0$. For otherwise, $v_3$ is a linear combination of $u_1$ and $u_2$ and hence a linear combination of $v_1$ and $v_2$. This implies $\{v_1, v_2, v_3\}$ is linearly dependent, a contradiction.




Figure 5.5.1 Gram-Schmidt process. 


Proceeding as above by induction, define

$$u_k = v_k - \sum_{i=1}^{k-1} \frac{\langle v_k, u_i \rangle}{\langle u_i, u_i \rangle} u_i.$$

Then $\langle u_k, u_i \rangle = 0$ for all $1 \leq i \leq k - 1$ and as before, $u_k \neq 0$. We have thus produced an orthogonal basis $\{u_1, \ldots, u_n\}$ of $V$. Then $\{e_i = u_i / \|u_i\|\}$ is an orthonormal basis of $V$.

□


The above process of obtaining an orthogonal basis is known as the 
Gram-Schmidt orthogonalization process. 
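The inductive formula for $u_k$ translates directly into a short program. The following is a minimal sketch, assuming the dot product on $\mathbb{R}^n$ and exact rational arithmetic; the function name `gram_schmidt` and the sample basis are hypothetical choices made here. It produces the orthogonal vectors $u_1, \ldots, u_n$; the final normalization $e_i = u_i / \|u_i\|$ involves square roots and is therefore done in floating point.

```python
import math
from fractions import Fraction

def inner(x, y):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return orthogonal u_1, ..., u_n built from linearly independent v_1, ..., v_n."""
    orthogonal = []
    for v in vectors:
        u = [Fraction(a) for a in v]
        for w in orthogonal:
            # Subtract the projection of v_k onto the earlier vector u_i.
            coeff = inner(v, w) / inner(w, w)
            u = [a - coeff * b for a, b in zip(u, w)]
        if all(a == 0 for a in u):
            raise ValueError("input vectors are linearly dependent")
        orthogonal.append(u)
    return orthogonal

# A sample basis of R^3 (hypothetical data).
vs = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
us = gram_schmidt(vs)

# Normalize to obtain an orthonormal basis (floating point from here on).
es = [[float(a) / math.sqrt(inner(u, u)) for a in u] for u in us]

print(us)
```

Each new vector is corrected using the original $v_k$, exactly as in the formula above. Numerical libraries often prefer a variant that projects the partially corrected vector instead, which behaves better in floating point; that variant is not what the proof describes.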


Exercise 5.5.7 Do you understand Exercise 5.2.3 now? 


Exercise 5.5.8 Show that the Gram-Schmidt process does not disturb the initial $r$ vectors if they already form an orthonormal set. That is, in the given basis $\{v_i\}_{i=1}^n$, if $\{v_1, \ldots, v_r\}$ is such that $\langle v_i, v_j \rangle = \delta_{ij}$ for $1 \leq i, j \leq r$ and if we apply the Gram-Schmidt process to get the orthonormal basis $\{e_i\}_{i=1}^n$, then $e_i = v_i$ for $1 \leq i \leq r$.

In particular, the Gram-Schmidt process when applied to an orthonormal basis returns it intact.


Exercise 5.5.9 Apply Gram-Schmidt process to obtain an orthonormal set:


(1) $\{(-1, 0, 1), (1, -1, 0), (0, 0, 1)\}$ in $\mathbb{R}^3$.


(2) $\{1, p_1(t) = t, p_2(t) = t^2\}$ of $P_2$ with the inner product


(3) $\{(1, 1, 1, 1), (0, 2, 0, 2), (-1, 1, 3, -1)\}$ in $\mathbb{R}^4$.
(4) $\{(1, -1, 1, -1), (5, 1, 1, 1), (2, 3, 4, -1)\}$ in $\mathbb{R}^4$.


Exercise 5.5.10 We can give another proof of the existence of an orthonormal basis by induction. If $V$ is a one-dimensional vector space, choose any nonzero vector $v \in V$. Then $\{u := v/\|v\|\}$ is an orthonormal basis of $V$. Assume the induction hypothesis that any inner product space of dimension less than or equal to $n - 1$ has an orthonormal basis. Let $V$ be an $n$-dimensional inner product space. Choose any nonzero vector $v \in V$. Consider $W := v^\perp$. Then $W$ is an inner product space of dimension $n - 1$ (see Example 5.2.1). Thus, by induction, $W$ has an orthonormal basis, say $\{w_1, \ldots, w_{n-1}\}$. Let $u_n$ be the unit vector along $v$. Then $\{w_1, \ldots, w_{n-1}, u_n\}$ is an orthonormal basis of $V$.

Note that this proof is not constructive while the Gram-Schmidt process gives us an algorithm to find an orthonormal basis.


5.6 Orthogonal Complements and Projections


Definition 5.6.1 Let $W \subset V$. Define

$$W^\perp = \{x \in V \mid \langle x, w \rangle = 0 \text{ for all } w \in W\}.$$

$W^\perp$ is called the orthogonal complement of $W$.


Exercise 5.6.1 $W^\perp$ is a vector subspace of $V$. (Exercise 5.2.8?)


Theorem 5.6.1 Let $W$ be a vector subspace of an inner product space $V$. Then $V = W \oplus W^\perp$. That is, any $x \in V$ is of the form $x = w + w'$, with $w \in W$ and $w' \in W^\perp$. Furthermore, this decomposition is unique.


Proof Choose an orthonormal basis $\{w_1, \ldots, w_r\}$ of $W$. Let $x \in V$. Let us define $w = \sum_{i=1}^r \langle x, w_i \rangle w_i \in W$. Let $w' = x - w$. Then $w' \in W^\perp$ as

$$\begin{aligned}
\langle w', w_k \rangle &= \langle x - w, w_k \rangle \\
&= \langle x, w_k \rangle - \langle w, w_k \rangle \\
&= \langle x, w_k \rangle - \Bigl\langle \sum_{i=1}^r \langle x, w_i \rangle w_i, w_k \Bigr\rangle \\
&= \langle x, w_k \rangle - \sum_{i=1}^r \langle x, w_i \rangle \langle w_i, w_k \rangle \\
&= \langle x, w_k \rangle - \langle x, w_k \rangle \\
&= 0.
\end{aligned}$$

Thus $w'$ is orthogonal to all $w_k$ and hence to the vector subspace spanned by them, that is, to $W$. We can thus write $x = w + w'$ as required.

Now if $x = w_1 + w_1'$, with $w_1 \in W$ and $w_1' \in W^\perp$, we then have

$$w + w' - w_1 - w_1' = 0 \quad \text{or} \quad w - w_1 = w_1' - w'.$$

The left hand side of this equation is in $W$ and the right hand side is in $W^\perp$. So if $z = w - w_1 = w_1' - w'$, then $z \in W \cap W^\perp$. From Exercise 5.2.7 it follows that $z = 0$, that is, $w = w_1$ and $w' = w_1'$. In case you have not solved Exercise 5.2.7, the solution follows: Since $z \in W^\perp$, $\langle z, y \rangle = 0$ for all $y \in W$. In particular, taking $y = z$, we get $\langle z, z \rangle = 0$. Hence $z = 0$. That is, $w - w_1 = 0 = w_1' - w'$ or $w = w_1$ and $w' = w_1'$. Thus the "decomposition" $x = w + w'$ is unique.

□
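The existence part of the proof is constructive, so it can be tried out directly. Below is a minimal sketch, assuming the dot product on $\mathbb{R}^3$ and a subspace $W$ given by an orthonormal basis with rational entries; the function name `decompose` and the sample data are hypothetical. It computes $w = \sum_i \langle x, w_i \rangle w_i$ and $w' = x - w$, then checks that $w'$ is orthogonal to each basis vector of $W$.

```python
from fractions import Fraction

def inner(x, y):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))

def decompose(x, onb_of_W):
    """Return (w, w_perp) with x = w + w_perp, w in W and w_perp orthogonal to W.

    onb_of_W must be an orthonormal basis of the subspace W.
    """
    w = [Fraction(0)] * len(x)
    for b in onb_of_W:
        c = inner(x, b)  # coefficient <x, w_i>
        w = [a + c * bk for a, bk in zip(w, b)]
    w_perp = [a - b for a, b in zip(x, w)]
    return w, w_perp

# W = span of two orthonormal vectors in R^3 (hypothetical example).
onb = [
    [Fraction(3, 5), Fraction(4, 5), Fraction(0)],
    [Fraction(0), Fraction(0), Fraction(1)],
]
x = [Fraction(1), Fraction(2), Fraction(3)]

w, w_perp = decompose(x, onb)

# w_perp is orthogonal to every basis vector of W, hence to W.
in_complement = all(inner(w_perp, b) == 0 for b in onb)

print(w, w_perp, in_complement)
```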


Definition 5.6.2 The decomposition of Theorem 5.6.1 is called the orthogonal decomposition of $V$ with respect to the subspace $W$. The expression $x = w + w'$ in the theorem is called the orthogonal decomposition of the vector $x$ with respect to $W$. The inner product space is said to be an orthogonal direct sum of $W$ and $W^\perp$.


Exercise 5.6.2 When do you say $V$ is the orthogonal direct sum of $W_i$, $1 \leq i \leq k$?


Exercise 5.6.3 Let $W$ be a vector subspace of $V$. What is $(W^\perp)^\perp$?


The next definition generalizes Definition 5.4.2: 




Definition 5.6.3 Let $W \subset V$ be a vector subspace of an inner product space $V$. Then the orthogonal projection $P_W$ of $V$ onto $W$ is the map $P_W(x) = w$ where $x = w + w'$ is the orthogonal decomposition of $x$.

During the course of the proof of Theorem 5.6.1, we have derived an expression of $P_W$ in terms of an orthonormal basis of $W$. If $\{w_1, \ldots, w_r\}$ is an orthonormal basis of $W$, then $P_W(v) := \sum_{i=1}^r \langle v, w_i \rangle w_i$.


Lemma 5.6.2 Let $W$ be a vector subspace of $V$. Let $\{w_i\}_{i=1}^r$ be an orthonormal basis of $W$. Let $\{u_j\}_{j=1}^s$ be an orthonormal basis of $W^\perp$. Then $\{w_i\} \cup \{u_j\}$ is an orthonormal basis of $V$.


Proof The set $\{w_i\} \cup \{u_j\}$ is certainly orthonormal. For, any pair is one of the forms: $(w_i, w_k)$, $(u_j, u_l)$, $(w_i, u_j)$. The first two have inner products $\delta_{ik}$ and $\delta_{jl}$ respectively while the inner product of the third pair is 0. Hence, they are linearly independent. Also, they span $V$. For, given $v \in V$, by the orthogonal decomposition theorem, we can write $v = w + w'$ with $w \in W$ and $w' \in W^\perp$. But $w$ (respectively $w'$) is a linear combination of $w_i$'s (respectively $u_j$'s). (Why?) Hence $v$ is a linear combination of $\{w_i\} \cup \{u_j\}$. Thus they form a basis.

□


What is the geometric interpretation of $P_W$? If $v \in V$, then $P_W(v) \in W$ is the unique element of $W$ which is nearest to $v$: $\|v - w\| \geq \|v - P_W(v)\|$ for all $w \in W$. In terms of the distance function $d$, we have

$$d(v, P_W(v)) \leq d(v, w)$$

for any $w \in W$. Let $w \in W$ be an arbitrary element. We denote $P_W(v)$ by $z$. Then

$$\|v - w\|^2 = \|v - z + z - w\|^2 = \|v - z\|^2 + \|z - w\|^2 \qquad (5.6.1)$$

since $v - z \perp W$ and $z - w \in W$. Thus for all $w \in W$, $\|v - w\|^2 \geq \|v - z\|^2$ and equality holds if and only if $\|z - w\| = 0$, that is, $w = P_W(v)$. Further, the distance $d(v, W) := \inf_{w \in W} d(v, w)$ is $d(v, P_W(v)) = \|v - P_W(v)\|$.


Proposition 5.6.3 Any subspace $W$ of $\mathbb{R}^n$ is the set of solutions (that is, a solution space) of a homogeneous system of linear equations.


Proof Let $\{w_1, \ldots, w_r\}$ (respectively $\{v_1, \ldots, v_s\}$) be an orthonormal basis of $W$ (respectively $W^\perp$). Then $\{w_1, \ldots, w_r, v_1, \ldots, v_s\}$ is an orthonormal basis of $\mathbb{R}^n$ by Lemma 5.6.2. Now $x \in \mathbb{R}^n$ lies in $W$ if and only if

$$\langle x, v_i \rangle = 0, \quad 1 \leq i \leq s.$$

This is a homogeneous system of linear equations. For, if we write

$$v_i = \sum_{j=1}^n a_{ij} e_j \quad \text{and} \quad x = \sum_{j=1}^n x_j e_j$$

with respect to the standard basis, then $x \in W$ if and only if $x$ is a solution of the homogeneous system

$$\sum_{j=1}^n a_{ij} x_j = 0, \quad 1 \leq i \leq s.$$

□


The rest of the subsection may be omitted in the first reading. 


Definition 5.6.4 Let $V$ be an arbitrary vector space. $V$ need not be an inner product space. A linear map $P : V \to V$ is said to be idempotent if $P^2 = P$.


Exercise 5.6.4 Show that the orthogonal projection $P_W$ with respect to a vector subspace $W$ is idempotent.


Figure 5.6.1 Orthogonal projection, 




Exercise 5.6.5 Let $V$ be an arbitrary vector space. Let $W_i$ be vector subspaces such that $V = W_1 \oplus W_2$ (see Definition 2.3.8). If $v = w_1 + w_2$ is given, define $P_i(v) = w_i$ for $i = 1, 2$. Then $P_i$ is idempotent. $P_i$ is called the projection of $V$ onto $W_i$ with respect to the given direct sum decomposition $V = W_1 \oplus W_2$. ($P_i$ depends on the factors $W_i$ as it is possible to have $V = W_1 \oplus W_2$ and $V = W_1 \oplus W_2'$ with $W_2 \neq W_2'$. The corresponding projections $P_1$ and $P_1'$ will then be different. Find an example of this phenomenon in $\mathbb{R}^2$.)


Definition 5.6.5 If $V$ is an inner product space, a linear map $T : V \to V$ is said to be symmetric (with respect to the given inner product) if $\langle Tx, y \rangle = \langle x, Ty \rangle$ for all $x, y \in V$.


Exercise 5.6.6 Check that the orthogonal projection $P_W$ is symmetric.


Exercise 5.6.7 Show that an orthogonal projection is a projection and that a projection is an orthogonal projection if and only if it is symmetric.


5.7 Linear Functionals and Hyperplanes 


V stands for an inner product space with dim V = n. 

We have already seen that for any fixed $a \in V$, the map $f_a : x \mapsto \langle x, a \rangle$ is a linear functional on $V$. The following theorem says these are the only ones.


Theorem 5.7.1 (Riesz Representation Theorem) Given a linear form $f : V \to \mathbb{R}$, there exists a unique $y \in V$ such that $f(x) = \langle x, y \rangle$ for all $x \in V$.


Proof Suppose there exists $y \in V$ such that $f(x) = \langle x, y \rangle$ for all $x \in V$. Choose an orthonormal basis $\{e_i\}_{i=1}^n$ of $V$. Then $y = \sum_{i=1}^n a_i e_i$ for some $a_i \in \mathbb{R}$. Now, $f \in L(V, \mathbb{R})$ and $f$ is completely determined if we know $f(e_i)$ for $1 \leq i \leq n$. Now $f(e_i) = \langle e_i, y \rangle = a_i$ for $1 \leq i \leq n$. This suggests that we take $y = \sum_i f(e_i) e_i$. It is easy to check that $f(x) = \langle x, y \rangle$ for all $x \in V$. For, if $x = \sum_i a_i e_i$, then

$$f(x) = \sum_i a_i f(e_i). \qquad (5.7.1)$$


Figure 5.7.1 Riesz representation theorem.


Also,

$$\begin{aligned}
\langle x, y \rangle &= \Bigl\langle x, \sum_i f(e_i) e_i \Bigr\rangle \\
&= \Bigl\langle \sum_j a_j e_j, \sum_i f(e_i) e_i \Bigr\rangle \\
&= \sum_{i,j} f(e_i) a_j \langle e_j, e_i \rangle \\
&= \sum_{i,j} f(e_i) a_j \delta_{ij} \\
&= \sum_i f(e_i) a_i. \qquad (5.7.2)
\end{aligned}$$

From Equations 5.7.1 and 5.7.2 it follows that $f(x) = \langle x, y \rangle$ for all $x \in V$.

Now, suppose $z$ is such that $f(x) = \langle x, z \rangle$ for all $x \in V$. Then,

$$f(x) = \langle x, z \rangle = \langle x, y \rangle.$$

Hence $\langle x, z - y \rangle = 0$ for all $x$. In particular, for $x = z - y$, we have $\langle z - y, z - y \rangle = 0$. But then $z - y = 0$ or $z = y$. Hence this $y$ is unique.

□

Did you recognize Exercise 5.2.6 and its solution towards the end of this proof?
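The recipe $y = \sum_i f(e_i) e_i$ from the first proof is easy to try out. The following is a minimal sketch, assuming $V = \mathbb{R}^3$ with the dot product and the standard basis; the linear form `f`, the helper `inner` and the sample points are hypothetical choices made here for illustration.

```python
def inner(x, y):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(x, y))

def f(x):
    """A sample linear form on R^3 (hypothetical)."""
    return 2 * x[0] - x[1] + 5 * x[2]

n = 3
# Standard basis e_1, ..., e_n of R^n, which is orthonormal for the dot product.
standard_basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# y = sum_i f(e_i) e_i; in the standard basis its coordinates are just f(e_i).
y = [f(e) for e in standard_basis]

# Check f(x) = <x, y> on a few sample points.
samples = [[1, 4, -3], [0, 0, 0], [-2, 7, 1]]
agrees = all(f(x) == inner(x, y) for x in samples)

print(y, agrees)
```

Agreement on finitely many sample points only illustrates the statement; the proof above is what establishes it for every $x$.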


We now give a geometric proof of the Riesz Representation Theorem. 


Proof If $f = 0$, then the obvious choice is $y = 0$. If $f \neq 0$, then $f$ is a nonzero linear form and $W = \ker f$ is of dimension $n - 1$, where $n = \dim V$. Thus there is a unit vector $u$ perpendicular to $W$, for $V = W \oplus W^\perp$ (that is, $u$ is a unit vector normal to the "plane"). $y$ must therefore be a multiple $au$ of $u$. The choice of $a$ is determined by the equation $f(u) = \langle u, y \rangle = \langle u, au \rangle = a$. Thus we take $y = au$ where $a = f(u)$. For $x \in V$, we have $x = w + tu$, where $w \in W$ and $t \in \mathbb{R}$ (see Theorem 5.6.1). Then

$$f(x) = f(w + tu) = f(w) + t f(u) = t f(u).$$

Also,

$$\langle x, y \rangle = \langle w + tu, au \rangle = a \langle w, u \rangle + ta \langle u, u \rangle = ta = t f(u).$$

Hence the result.


5.7.1 Hyperplanes 


This section is an extended discussion of the geometric idea introduced in the second proof of the Riesz representation theorem.

We shall start with a geometric definition of a plane in $\mathbb{R}^3$.


Definition 5.7.1 A plane in $\mathbb{R}^3$ through a point $p$ with normal $N$ is the set of all lines passing through $p$ and perpendicular to $N$. We denote such a plane by $\Pi(p, N)$ (see Figure 5.7.2).


Figure 5.7.2 Planes in $\mathbb{R}^3$.


Let $X \in \Pi(p, N)$ be a point on the plane. Then the direction vectors $t(X - p)$ of the line $\ell(X, p)$ are perpendicular to $N$ if and only if $\langle t(X - p), N \rangle = 0$ for all $t$, that is, if and only if $\langle X - p, N \rangle = 0$. This happens if and only if $\langle X, N \rangle = \langle p, N \rangle$, or if and only if $\langle X, N \rangle = d$, where $d = \langle p, N \rangle$ is a constant.

In $\mathbb{R}^3$, let $N = (a, b, c) \neq 0$, $X = (x, y, z)$ and $p = (x_0, y_0, z_0)$. Then $\langle X, N \rangle = ax + by + cz$. Hence $\langle X, N \rangle = d$ is equivalent to the linear equation $ax + by + cz = d$, where $d = ax_0 + by_0 + cz_0$. Thus, from our geometric definition of a plane we see that any plane is given by a linear equation. Conversely, if a nontrivial linear equation $\sum_{i=1}^3 a_i x_i = b$ is given, then it defines a plane whose normal is $(a_1, a_2, a_3)$.

This shows us how to define analogous objects in higher dimensions. 


Definition 5.7.2 Let $V$ be an inner product space. Fix a nonzero vector
$N \in V$ and a real number $b$. Then the set $\Pi := \{x \in V \mid \langle x, N\rangle = b\}$ is
called a hyperplane in $V$ with normal $N$.


Thus points in $\mathbb{R}$, lines in $\mathbb{R}^2$ and planes in $\mathbb{R}^3$ are hyperplanes.

However, note that in Chapter 3 we were able to define a hyperplane in
any vector space, not necessarily an inner product space. Does the new notion
coincide with the one we defined earlier in Chapter 3? The answer is yes.

First we show that any hyperplane according to this new definition is
a hyperplane according to Chapter 3. There are two ways of seeing this.
Let $\Pi$ be a hyperplane in $V$. If we fix a point $p \in \Pi$ and take $W$ to be
$\{x \in V \mid \langle x, N\rangle = 0\}$, then $\Pi = p + W$. The other one is to show that the
line joining any two points of $\Pi$ is contained in $\Pi$. If $p_i \in \Pi$, for $i = 1, 2$,
then $\langle p_i, N\rangle = b$. So $\langle tp_1 + (1 - t)p_2, N\rangle = tb + (1 - t)b = b$.

Conversely, let $\Pi$ be a hyperplane according to our earlier definition, say,
of the form $\Pi = x + W$ where $W$ is a vector subspace of dimension
$\dim V - 1$. By the orthogonal decomposition theorem, we can find a nonzero
vector $N \perp W$. If $b := \langle x, N\rangle$, we can easily show that

$$\Pi = \{v \in V \mid \langle v, N\rangle = b\}.$$

Thus our new definition gives a geometric characterization of hyperplanes in
an inner product space. Another way of saying this is that any
hyperplane is of the form $f_a^{-1}(d)$ for some nonzero $a \in V$ and $d \in \mathbb{R}$. Let
$u$ be the unit vector along $a$. Extend it to an orthonormal basis of $V$, say,
$\{u_1, \ldots, u_{n-1}, u_n = u\}$. Let $x_i$ be the coordinates associated with this basis:
$x_i(v) := \langle v, u_i\rangle$. Since $f_a(v) = \langle v, a\rangle = \|a\|\, x_n(v)$, the hyperplane $f_a^{-1}(d)$
has the following description:
$\{v \in V \mid x_n(v) = d / \|a\|\}$.

In the rest of this section, we want to find a formula for the distance
$d(v, \Pi) := \inf\{d(v, w) \mid w \in \Pi\}$ of a point $v$ to a hyperplane $\Pi$. If the
reader wishes, he may assume $V = \mathbb{R}^n$ (or even $\mathbb{R}^2$) with the dot product.

We shall first derive a formula for $d(v, W)$ if $W$ is a vector subspace
of dimension $n - 1$. We have seen in Section 5.6 that if $W$ is any vector
subspace, then $d(v, W) = \|v - P_W(v)\|$. We can say more if we assume that
$\dim W = \dim V - 1$. Let $V = W \oplus W^\perp$ be the orthogonal decomposition. By
Lemma 5.6.2, $\dim W^\perp = 1$. So, if $W^\perp = \mathbb{R}u$, choose a unit vector $N \in W^\perp$.
Then $N$ is of the form $\pm(u / \|u\|)$. Note that $v = P_W(v) + \langle v, N\rangle N$. (Why?)
Thus

$$d(v, W) = \|v - P_W(v)\| = \|\langle v, N\rangle N\| = |\langle v, N\rangle|, \qquad (5.7.3)$$

where $N$ is a unit vector orthogonal to $W$.

We now specialize Equation (5.7.3). Let $V = \mathbb{R}^3$ and $W$ be a plane
passing through the origin. Then there exist $a, b, c \in \mathbb{R}$, not all zero, such
that $W = \{(x, y, z) \in \mathbb{R}^3 \mid ax + by + cz = 0\}$. $W$ is a vector subspace of
$V$ and we have $\mathbb{R}^3 = W \oplus \mathbb{R}(a, b, c)$. In geometric language, $(a, b, c)$ is a
normal to the plane $W$. We take as unit normal $N = (a, b, c)/\sqrt{a^2 + b^2 + c^2}$.
Then Equation (5.7.3) reads

$$d((x, y, z), W) = |\langle (x, y, z), N\rangle| = \left|\left\langle (x, y, z), \frac{(a, b, c)}{\sqrt{a^2 + b^2 + c^2}}\right\rangle\right| = \frac{|ax + by + cz|}{\sqrt{a^2 + b^2 + c^2}}.$$

This is a well-known formula from analytic geometry of three dimensions.

To treat the general case, we shall exploit the translation invariant nature
of the distance $d$ on $V$ (Exercise 5.1.11).

Let us now assume that the hyperplane $\Pi$ is given by $\langle x, u\rangle = d$ for a unit
vector $u$. Let $W := \ker f_u = \{x \in V \mid \langle x, u\rangle = 0\}$. Then $W$ is an $(n - 1)$-dimensional
vector subspace with $u \perp W$. Let $p \in \Pi$ be an arbitrary point.
Then $W = \Pi - p$ (verify this). We know that $d(v - p, x - p) = d(v, x)$ for
all $v, x, p \in V$. Hence, we see

$$\begin{aligned}
\inf\{d(v, x) \mid x \in \Pi\} &= \inf\{d(v - p, x - p) \mid x \in \Pi\} \\
&= \inf\{d(v - p, w) \mid w \in W\} \\
&= |\langle v - p, u\rangle| \quad \text{by Equation (5.7.3)} \\
&= |\langle v, u\rangle - \langle p, u\rangle|.
\end{aligned} \qquad (5.7.4)$$

If the hyperplane is given by $f_N^{-1}(d)$ for some nonzero $N$, then by taking
$u = N/\|N\|$ in Equation (5.7.4), we see that

$$d(v, \Pi) = \frac{1}{\|N\|}\,|\langle v, N\rangle - \langle p, N\rangle|. \qquad (5.7.5)$$


As earlier in the case of $\mathbb{R}^3$ and $\Pi = \{(x, y, z) \mid ax + by + cz = d\}$,
Equation (5.7.5) becomes the well-known formula

$$d((x, y, z), \Pi) = \frac{|ax + by + cz - d|}{\sqrt{a^2 + b^2 + c^2}}.$$
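The distance formula lends itself to a quick numerical check. The following is a minimal Python sketch, assuming $V = \mathbb{R}^n$ with the dot product and a hyperplane written as $\langle x, N\rangle = d$; the function name `distance_to_hyperplane` is ours and only illustrative.

```python
import math


def dot(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def distance_to_hyperplane(v, normal, d):
    """Distance from v to {x : <x, N> = d}, computed as |<v, N> - d| / ||N||."""
    norm = math.sqrt(dot(normal, normal))
    if norm == 0:
        raise ValueError("the normal must be nonzero")
    return abs(dot(v, normal) - d) / norm


# The plane x + 2y + 2z = 3 in R^3 and the point (1, 1, 1):
# <v, N> = 5 and ||N|| = 3, so the distance is |5 - 3| / 3.
print(distance_to_hyperplane((1, 1, 1), (1, 2, 2), 3))
```

Note that the point $p$ of the hyperplane enters only through $\langle p, N\rangle = d$, which is why the function takes $d$ rather than $p$.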


The rest of the section may be omitted as it derives the formula for
$d(p, \Pi)$ in two more different ways. They are included here just to show
the different approaches possible to attack the same problem and also to
establish the supremacy of the above approach.

Let $\Pi$ be a plane given by $\langle X, N\rangle = d$. Let $q \in V \setminus \Pi$. Our aim is to
compute $d(q, \Pi) := \inf_{p \in \Pi} d(q, p) = \inf_{p \in \Pi} \|q - p\|$.

If $p \in \Pi$ is such that $d(q, p) = d(q, \Pi)$ we claim that $q - p \perp \Pi$. Grant
this claim for a moment (see Figure 5.7.3). Then $q - p = aN$ for some
$a \in \mathbb{R}$. Hence $\langle q - p, N\rangle = a\langle N, N\rangle = a\|N\|^2$ so that

$$|\langle q - p, N\rangle| = |a|\,\|N\|^2 = \|q - p\|\,\|N\|.$$

It follows that

$$d(p, q) = \|p - q\| = \frac{|\langle q - p, N\rangle|}{\|N\|}.$$

This reduces to the standard formula seen in analytic geometry. If we
assume that the plane is given by the equation $ax + by + cz = d$, then we
let $N = (a, b, c)$ and $q = (x, y, z)$. The above formula then becomes

$$d(q, \Pi) = \frac{|ax + by + cz - d|}{\sqrt{a^2 + b^2 + c^2}}.$$

It remains to show that $q - p \perp \Pi$. As was done earlier, we use calculus
to prove this.

Let $v$ be such that $\langle v, N\rangle = 0$. Consider $p + tv$. Then $p + tv \in \Pi$. For,
$\langle p + tv, N\rangle = \langle p, N\rangle + t\langle v, N\rangle = d + 0 = d$.

If we set $f(t) := \|p + tv - q\|^2$, then

$$\begin{aligned}
f(t) &= \langle p - q + tv,\ p - q + tv\rangle \\
&= \langle p - q, p - q\rangle + 2t\langle p - q, v\rangle + t^2\langle v, v\rangle.
\end{aligned}$$

Since $p$ is assumed to be nearest to $q$ and $f(0) = \|p - q\|^2$, we see that $f$
has a minimum at $t = 0$. Thus $f'(0)$ must be zero. Since

$$f'(t) = 2\langle p - q, v\rangle + 2t\langle v, v\rangle$$

we have $0 = f'(0) = 2\langle p - q, v\rangle$. Thus $p - q \perp v$. Since $v$ is any vector
perpendicular to $N$, we see that $p - q = aN$ for some $a \in \mathbb{R}$.


Figure 5.7.3 Distance between $q$ and $\Pi$.


The crucial question is: How do we know that there exists one such
$p \in \Pi$? Our entire analysis hinged on the existence of one such point. Here
you need topology or analysis (depending on your preference) to answer
this question affirmatively.

We now give a highly geometric proof which gives us the point and also
lets us calculate the distance $d(q, \Pi)$.

By geometry (see Figure 5.7.3) we expect that the point of intersection
of the perpendicular dropped from $q$ to $\Pi$ will be the required point. The
perpendicular is the line through $q$ with direction vector $N$.

Let us find the point of intersection $p$ of the normal line $\ell(q; N)$ to the
plane and the plane. Then $p$ is of the form $q + t_0 N$ for some $t_0 \in \mathbb{R}$. The
point $p$ lies on $\Pi$ if and only if $\langle p, N\rangle = b$, if and only if $\langle q + t_0 N, N\rangle = b$,
if and only if $\langle q, N\rangle + t_0\langle N, N\rangle = b$, if and only if $t_0 = \dfrac{b - \langle q, N\rangle}{\|N\|^2}$. Thus the
required point is $p := q + t_0 N$.

Now if $p' \in \Pi$, then

$$\langle p' - p,\ p - q\rangle = \langle p', t_0 N\rangle - \langle p, t_0 N\rangle = t_0 b - t_0 b = 0.$$

Thus $p - p'$ and $p - q$ are perpendicular for any $p' \in \Pi$. Therefore, by
Pythagoras theorem (Lemma 5.2.1), we have

$$\|p' - q\|^2 = \|p' - p + p - q\|^2 = \|p - p'\|^2 + \|p - q\|^2.$$

This shows that $\|p - q\| \le \|p' - q\|$ and equality holds if and only if $p = p'$.


Furthermore,

$$d(q, \Pi) = d(p, q) = \|p - q\| = \|t_0 N\| = \frac{|b - \langle q, N\rangle|}{\|N\|^2}\,\|N\| = \frac{|\langle q, N\rangle - b|}{\|N\|}.$$

I hope you enjoyed seeing how we turn geometric ideas into rigorous proofs.
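The construction of the nearest point can be traced on a concrete example. The following is a minimal Python sketch, assuming $V = \mathbb{R}^n$ with the dot product and the hyperplane $\langle x, N\rangle = b$; the name `foot_of_perpendicular` is ours and only illustrative.

```python
import math


def dot(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def foot_of_perpendicular(q, normal, b):
    """Return p = q + t0*N, where t0 = (b - <q, N>) / ||N||^2,
    the point where the normal line through q meets {x : <x, N> = b}."""
    t0 = (b - dot(q, normal)) / dot(normal, normal)
    return tuple(qi + t0 * ni for qi, ni in zip(q, normal))


q, N, b = (1.0, 1.0, 1.0), (1.0, 2.0, 2.0), 3.0
p = foot_of_perpendicular(q, N, b)

# p lies on the hyperplane, and ||p - q|| agrees with |<q, N> - b| / ||N||.
print(dot(p, N))
print(math.dist(p, q), abs(dot(q, N) - b) / math.sqrt(dot(N, N)))

# Any other point of the hyperplane, for instance (3, 0, 0), is farther from q.
print(math.dist((3.0, 0.0, 0.0), q) >= math.dist(p, q))
```

Unlike the calculus argument, this approach produces the point $p$ explicitly, so its existence is not in question.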


5.8 Orthogonal Transformations 


$V$ denotes an inner product space of dimension $n$ unless stated otherwise.
We look for linear maps $T: V \to V$ which "preserve" the extra structure
$\langle\ ,\ \rangle$. So we make the following definition:


Definition 5.8.1 Let $V$ be an inner product space. A linear transformation
$T: V \to V$ is said to be orthogonal if $\langle Tx, Ty\rangle = \langle x, y\rangle$ for all $x, y \in V$.


Note that this definition implies that $\|Tx\| = \|x\|$ for all $x \in V$. What
is surprising is that this is sufficient too:


Theorem 5.8.1 Let $T: V \to V$ be linear. The following are equivalent:
(1) $T$ is orthogonal.
(2) $\|Tx\| = \|x\|$ for all $x \in V$.
(3) $T$ takes an orthonormal basis to an orthonormal basis. That is, if
$\{e_i\}_{i=1}^n$ is an orthonormal basis, then $\{Te_i\}_{i=1}^n$ is an orthonormal
basis.


Proof. As observed, (1) implies (2): If $T$ is orthogonal, then

$$\langle Tx, Tx\rangle = \langle x, x\rangle.$$

The left side is $\|Tx\|^2$ and the right side is $\|x\|^2$ and they are equal. Hence
(2) follows.

To show (2) implies (1) we need to prove that if $\|Tx\| = \|x\|$ for all
$x \in V$, then $\langle Tx, Ty\rangle = \langle x, y\rangle$ for all $x, y \in V$. This is an often encountered
idea. We know something about $\|x\|$ and we want to say something about
$\langle x, y\rangle$. To get these "cross-terms", the idea is to exploit what we know
about $\|x + y\|$ or $\|x - y\|$. We have

$$\langle x + y, x + y\rangle = \|x\|^2 + \|y\|^2 + 2\langle x, y\rangle \quad \text{for } x, y \in V. \qquad (5.8.1)$$

We also have

$$\begin{aligned}
\langle T(x + y), T(x + y)\rangle &= \langle Tx + Ty, Tx + Ty\rangle \quad \text{by linearity of } T \\
&= \langle Tx, Tx\rangle + \langle Ty, Ty\rangle + 2\langle Tx, Ty\rangle \\
&= \|Tx\|^2 + \|Ty\|^2 + 2\langle Tx, Ty\rangle.
\end{aligned}$$

By hypothesis $\|Tx\| = \|x\|$, $\|Ty\| = \|y\|$ and $\|T(x + y)\| = \|x + y\|$ so
that this yields

$$\|x + y\|^2 = \|T(x + y)\|^2 = \|x\|^2 + \|y\|^2 + 2\langle Tx, Ty\rangle. \qquad (5.8.2)$$

Comparing Equation (5.8.1) with Equation (5.8.2), we get $\langle Tx, Ty\rangle = \langle x, y\rangle$.
Thus (1) is proved.

We prove that (3) implies (1): Let $\{e_i\}$ be an orthonormal basis of $V$.
We are given that $\{Te_i\}$ is again an orthonormal basis of $V$. We are to show
that $T$ is orthogonal. Let $x = \sum_i x_i e_i$ and $y = \sum_j y_j e_j$. Then $Tx = \sum_i x_i Te_i$
and $Ty = \sum_j y_j Te_j$ (by linearity of $T$). Hence

$$\begin{aligned}
\langle Tx, Ty\rangle &= \Big\langle \sum_i x_i Te_i,\ \sum_j y_j Te_j \Big\rangle \quad \text{by linearity of } T \\
&= \sum_{i,j} x_i y_j \langle Te_i, Te_j\rangle \\
&= \sum_{i,j} x_i y_j \delta_{ij} \quad \text{since } \{Te_i\} \text{ is orthonormal} \\
&= \sum_i x_i y_i \\
&= \langle x, y\rangle \quad \text{by Exercise 5.5.2.}
\end{aligned}$$

Hence $T$ is orthogonal.

Conversely, if $T$ is orthogonal, then $\langle Te_i, Te_j\rangle = \delta_{ij}$. So $\{Te_i\}_{i=1}^n$ is an
orthonormal set. By Lemma 5.5.2 it is linearly independent. Since it has
$n$ elements, it is a basis. Hence $\{Te_i\}$ is an orthonormal basis. Thus (1)
implies (3).

We have shown that (1) is equivalent to (2) and also to (3). Thus (1),
(2) and (3) are mutually equivalent.
□
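The three equivalent conditions of Theorem 5.8.1 can be observed on a concrete map. The following is a minimal Python sketch, assuming $V = \mathbb{R}^2$ with the dot product and taking for $T$ a rotation through an arbitrarily chosen angle; the helper names `dot` and `apply` are ours. It checks the conditions on sample vectors only and is of course no substitute for the proof.

```python
import math


def dot(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def apply(matrix, v):
    """Apply a matrix, stored as a list of rows, to the vector v."""
    return tuple(dot(row, v) for row in matrix)


theta = 0.7  # an arbitrary sample angle
T = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

x, y = (3.0, -1.0), (0.5, 2.0)
Tx, Ty = apply(T, x), apply(T, y)

# (1) inner products are preserved
preserves_inner = math.isclose(dot(Tx, Ty), dot(x, y))
# (2) norms are preserved
preserves_norm = math.isclose(dot(Tx, Tx), dot(x, x))
# (3) the images of e_1, e_2 form an orthonormal set
Te1, Te2 = apply(T, (1.0, 0.0)), apply(T, (0.0, 1.0))
orthonormal_image = (math.isclose(dot(Te1, Te1), 1.0)
                     and math.isclose(dot(Te2, Te2), 1.0)
                     and math.isclose(dot(Te1, Te2), 0.0, abs_tol=1e-12))

print(preserves_inner, preserves_norm, orthonormal_image)
```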


Lemma 5.8.2 If $f: V \to V$ is any map such that

(1) $f(0) = 0$,
(2) $\|f(x) - f(y)\| = \|x - y\|$ for all $x, y \in V$,

then $f$ is an orthogonal linear transformation.


Remark 5.8.1 This result is really very amazing. Note that the second
condition means that $d(f(x), f(y)) = d(x, y)$ for all $x, y \in V$, that is, $f$
preserves distances. Thus a map $f: V \to V$ which preserves the distances
and maps the zero vector to itself is necessarily a linear map and also
orthogonal. Recall that distance is twice removed from the inner product.
From the inner product, one gets the norm and from it the distance. The
bond seems to be so strong that the "distant" cousin forces linearity. Note
that the result is no longer true if we do not assume $f(0) = 0$. For, if we
take $f = T_a$ as the translation by a nonzero vector $a \in V$, then $f$ preserves
distances and it is not linear (see Exercise 4.1.2 and Exercise 5.1.11).


Proof From (1) and (2), we have

$$\|f(x)\| = \|f(x) - 0\| = \|f(x) - f(0)\| = \|x - 0\| = \|x\|, \qquad (5.8.3)$$

that is, $\|f(x)\| = \|x\|$ for all $x \in V$. Hence

$$\begin{aligned}
\|f(x) - f(y)\|^2 &= \langle f(x), f(x)\rangle - 2\langle f(x), f(y)\rangle + \langle f(y), f(y)\rangle \\
&= \|f(x)\|^2 + \|f(y)\|^2 - 2\langle f(x), f(y)\rangle \\
&= \|x\|^2 + \|y\|^2 - 2\langle f(x), f(y)\rangle, \quad \text{by Equation (5.8.3).}
\end{aligned}$$

Using (2) again, we get

$$\|x - y\|^2 = \|f(x) - f(y)\|^2.$$

But $\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\langle x, y\rangle$. It follows that $\langle f(x), f(y)\rangle = \langle x, y\rangle$.
Thus $f$ preserves the inner product. In particular, if $\{e_i\}_{i=1}^n$ is an orthonormal
basis, $\{f(e_i)\}_{i=1}^n$ is an orthonormal basis too. Now, let $x = \sum_i x_i e_i \in V$.
Since $\{f(e_i)\}_{i=1}^n$ is an orthonormal basis of $V$,

$$\begin{aligned}
f(x) &= \sum_i \langle f(x), f(e_i)\rangle f(e_i) \quad \text{by Theorem 5.5.1} \\
&= \sum_i \langle x, e_i\rangle f(e_i) \\
&= \sum_i x_i f(e_i).
\end{aligned}$$

This means that $f(\sum_i x_i e_i) = \sum_i x_i f(e_i)$. That is, $f$ is linear.


The following corollary identifies all distance preserving maps $f: V \to V$.
It says they are compositions of orthogonal linear transformations and
translations.


Corollary 5.8.3 Let $g: V \to V$ be such that $\|g(x) - g(y)\| = \|x - y\|$ for
all $x, y \in V$. Then there exist a unique $v \in V$ and an orthogonal linear
transformation $A: V \to V$ such that $g(x) = Ax + v$ for all $x \in V$.


Proof Take $v = g(0)$ and $f(x) = g(x) - g(0)$. Then one easily checks that
$f$ satisfies the hypothesis of Lemma 5.8.2 and hence $f(x) = Ax$ for some
orthogonal linear transformation $A: V \to V$ and that $g(x) = Ax + v$.
□
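The proof of the corollary is constructive: $v = g(0)$ and the columns of the matrix of $A$ are $g(e_i) - g(0)$. The following is a minimal Python sketch of this recipe, assuming $V = \mathbb{R}^n$ with the standard basis. The map `g` is a hypothetical example of a rigid motion (a rotation through a right angle followed by a shift), and `split_isometry` is our own illustrative name; the code takes for granted that `g` preserves distances and does not verify it.

```python
def g(x):
    """A hypothetical rigid motion of R^2: rotate by 90 degrees, then shift by (2, -1)."""
    return (-x[1] + 2.0, x[0] - 1.0)


def split_isometry(g, dim):
    """Return (A, v) with g(x) = A x + v, following the proof of the corollary:
    v = g(0), and the i-th column of A is f(e_i) = g(e_i) - g(0)."""
    zero = tuple(0.0 for _ in range(dim))
    v = g(zero)
    columns = []
    for i in range(dim):
        e = tuple(1.0 if k == i else 0.0 for k in range(dim))
        columns.append(tuple(a - b for a, b in zip(g(e), v)))
    # Store A as a list of rows: A[r][c] is the entry in row r, column c.
    A = [[columns[c][r] for c in range(dim)] for r in range(dim)]
    return A, v


A, v = split_isometry(g, 2)
print(A, v)

# Check g(x) = A x + v on a sample vector.
x = (3.0, 4.0)
Ax_plus_v = tuple(sum(A[r][c] * x[c] for c in range(2)) + v[r] for r in range(2))
print(Ax_plus_v == g(x))
```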


Exercise 5.8.1 Let $V$ be an arbitrary vector space. Let $T_v$ denote the
translation by $v$. Prove the following:

(1) $T_v$ is a bijection for all $v \in V$ such that $T_v^{-1} = T_{-v}$.
(2) $T_{v+w} = T_v \circ T_w$ for all $v, w \in V$.

Translation is the device we employ to "shift the origin".


Exercise 5.8.2 Let $A$ be an orthogonal linear map of $V$, $T_v$ a translation.
What is the inverse of $A \circ T_v$? What are $A \circ T_v$, $T_v \circ A$, $A \circ T_v \circ A^{-1}$,
$T_v^{-1} \circ A \circ T_v$? Note that in general these are not linear maps.

If you know some group theory, the answers to these questions should
tell you that the set

$$\{A \circ T_v \mid A \text{ is any orthogonal map of } V \text{ and } v \in V\}$$

is a group, called the group of rigid motions. This group is nonabelian.
The set $\{T_v \mid v \in V\}$ is a normal subgroup.


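Let $V$ be an inner product space and let $T: V \to V$ be an orthogonal
linear transformation. Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of $V$. Then
$\{Te_1, \ldots, Te_n\}$ is also an orthonormal basis of $V$. Let $T(e_i) = \sum_j a_{ij} e_j$.
We know how to find $a_{ik}$: $\langle Te_i, e_k\rangle = \sum_j a_{ij}\langle e_j, e_k\rangle = a_{ik}$. Therefore the
matrix $M(T)$ with respect to this orthonormal basis is given by

$$M(T) = \begin{pmatrix}
\langle Te_1, e_1\rangle & \cdots & \langle Te_n, e_1\rangle \\
\vdots & & \vdots \\
\langle Te_1, e_n\rangle & \cdots & \langle Te_n, e_n\rangle
\end{pmatrix}
= \begin{pmatrix}
a_{11} & a_{21} & \cdots & a_{n1} \\
a_{12} & a_{22} & \cdots & a_{n2} \\
\vdots & \vdots & & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{nn}
\end{pmatrix}
= (C_1, \ldots, C_n),$$

where $C_i$ is the $i$th column of $M(T)$. We think of $C_i$ as a column vector in
$\mathbb{R}^n$.

Since $T$ is orthogonal, $\{Te_i\}$ is an orthonormal basis. Hence

$$\langle Te_i, Te_j\rangle = \delta_{ij}.$$

Since $Te_i = \sum_r a_{ir} e_r$, we have

$$\begin{aligned}
\langle Te_i, Te_j\rangle &= \Big\langle \sum_r a_{ir} e_r,\ \sum_s a_{js} e_s \Big\rangle \\
&= \sum_{r,s} a_{ir} a_{js} \langle e_r, e_s\rangle \\
&= \sum_{r,s} a_{ir} a_{js} \delta_{rs} = \sum_r a_{ir} a_{jr} = C_i \cdot C_j,
\end{aligned}$$

the dot product of the column vectors in $\mathbb{R}^n$.
This suggests the following definition: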
This suggests the following definition: 


Definition 5.8.2 An $n \times n$ matrix $A = (a_{ij})$ is said to be orthogonal if
$C_i \cdot C_j = \delta_{ij}$, where $C_i$ stands for the $i$th column of $A$ considered as a
column vector in $\mathbb{R}^n$.


Note Let $V$ be an inner product space. If we start with any orthonormal
basis $\{e_1, \ldots, e_n\}$ of $V$ and an orthogonal transformation $T: V \to V$ and
write down its matrix with respect to this orthonormal basis, then the
matrix is orthogonal. This is just what we have seen and which motivated
the definition of an orthogonal matrix.


Exercise 5.8.3 In fact, if an orthonormal basis is fixed, there exists a one-one
correspondence between orthogonal transformations and orthogonal
matrices. Can you prove this? If an orthogonal matrix $A = (a_{ij})$ is given
we know how to obtain a linear map of $V$ to itself. It is then easy to check
that this map is orthogonal.


Exercise 5.8.4 A matrix $A$ is orthogonal if and only if $A^{-1} = A^t$ if and
only if $AA^t = I$. Then we have

$$1 = \det I = \det(AA^t) = \det A \,\det A^t = (\det A)^2.$$

This means that $\det A = \pm 1$ for an orthogonal matrix $A$ (see
Theorem 6.4.1).
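Definition 5.8.2 translates directly into a test on the columns of a matrix. The following is a minimal Python sketch, assuming matrices are stored as lists of rows and comparing floating point numbers up to a small tolerance; the names `columns`, `is_orthogonal` and `det2` are ours, and `det2` uses the usual $2 \times 2$ formula.

```python
import math


def columns(A):
    """Return the columns of a matrix stored as a list of rows."""
    return [tuple(row[j] for row in A) for j in range(len(A[0]))]


def is_orthogonal(A, tol=1e-12):
    """Check C_i . C_j = delta_ij for the columns C_i of the square matrix A."""
    cols = columns(A)
    n = len(cols)
    return all(
        abs(sum(a * b for a, b in zip(cols[i], cols[j])) - (1.0 if i == j else 0.0)) <= tol
        for i in range(n) for j in range(n)
    )


def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


R = [[0.6, -0.8], [0.8, 0.6]]    # orthogonal, determinant +1
S = [[0.6, 0.8], [0.8, -0.6]]    # orthogonal, determinant -1
B = [[1.0, 1.0], [0.0, 1.0]]     # a shear: determinant 1 but not orthogonal

for M in (R, S, B):
    print(is_orthogonal(M), det2(M))
```

The shear `B` illustrates that $\det A = \pm 1$ is necessary but not sufficient for orthogonality.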


Exercise 5.8.5 Define an “orthogonal linear map” from an inner product 
space V into another. Prove a result analogous to Theorem 5.8.1. 


Exercise 5.8.6 Show that the choice of an orthonormal basis $\{v_1, \ldots, v_n\}$
for $V$ gives rise to an orthogonal linear map $T$ from $V$ to $\mathbb{R}^n$: $Tv_i := e_i$ for
$1 \le i \le n$ and extended linearly.


Note the analogy. The choice of a basis of an arbitrary vector space
gave rise to a linear isomorphism from the vector space onto some $\mathbb{R}^n$ while
the choice of an orthonormal basis of $V$ gives rise to an orthogonal linear
isomorphism $T$ from $V$ onto $\mathbb{R}^n$.


5.8.1 Coordinates Associated with an Orthonormal 
Basis 


Let $V$ be a finite dimensional vector space over $\mathbb{R}$. Recall how the choice of
a basis $\{v_i\}$ in $V$ introduced a system of coordinates. Given any vector $v \in V$
we can write it in the form $v = \sum_i x_i v_i$. We call $x_i$ the $i$th coordinate of
$v$. This way we identify $V$ with $\mathbb{R}^n$ via the linear isomorphism

$$v \mapsto (x_1, \ldots, x_n) \in \mathbb{R}^n.$$

Now assume that $V$ is an inner product space and that $\{e_i\}_{i=1}^n$ is a fixed
orthonormal basis of $V$. (If you desire you may assume that $V = \mathbb{R}^n$ with
the dot product and that the $e_i$'s are the standard basis vectors.) Let $x_i$ be
the coordinates of a vector $v$ with respect to this fixed ("standard") basis:
$v = \sum_i x_i e_i$. Let $\{v_j\}_{j=1}^n$ be another orthonormal basis of $V$ and $v = \sum_j y_j v_j$.
We call the $y_j$'s the new coordinates. How are the old coordinates related to
the new ones?

Recall that if $v = \sum_i x_i e_i$ then $x_i = \langle v, e_i\rangle$. Write $v_j = \sum_i v_{ji} e_i$ where
$v_{ji} = \langle v_j, e_i\rangle$. If $v = \sum_j y_j v_j$, we then have

$$x_i = \Big\langle \sum_j y_j v_j,\ e_i \Big\rangle = \sum_j y_j \langle v_j, e_i\rangle = \sum_j v_{ji}\, y_j.$$

Do you see the pattern? If we form the matrix

$$K := \begin{pmatrix}
v_{11} & v_{21} & \cdots & v_{n1} \\
v_{12} & v_{22} & \cdots & v_{n2} \\
\vdots & \vdots & & \vdots \\
v_{1n} & v_{2n} & \cdots & v_{nn}
\end{pmatrix}$$

then we know that this matrix represents the orthogonal linear map which
maps $e_i$ to $v_i$. Then the coefficients of the $y_j$'s in the expression of $x_i$ are the
elements of the $i$th row in this matrix.
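The relation between old and new coordinates can be tried out on a small example. The following is a minimal Python sketch, assuming $V = \mathbb{R}^n$ with the standard basis as the fixed one and the new orthonormal basis given by the standard coordinates of its vectors; the names `old_from_new` and `new_from_old` are ours and only illustrative.

```python
import math


def old_from_new(new_basis, y):
    """x_i = sum_j <v_j, e_i> y_j, where new_basis[j] holds the standard
    coordinates of v_j, so that new_basis[j][i] = <v_j, e_i>."""
    n = len(y)
    return tuple(sum(new_basis[j][i] * y[j] for j in range(n)) for i in range(n))


def new_from_old(new_basis, x):
    """y_j = <v, v_j>, which is valid because the v_j are orthonormal."""
    return tuple(sum(a * b for a, b in zip(x, vj)) for vj in new_basis)


c = 1 / math.sqrt(2)
basis = [(c, c), (-c, c)]        # the standard basis of R^2 rotated by 45 degrees

y = (2.0, 1.0)                   # new coordinates of some vector v
x = old_from_new(basis, y)       # its old (standard) coordinates
print(x)
print(new_from_old(basis, x))    # recovers the new coordinates, up to rounding
```

Going back from the old to the new coordinates uses the transposed matrix, in line with $A^{-1} = A^t$ for an orthogonal matrix.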


We shall put these observations into use in the section on classification 
of quadrics (Section 8.2). 




5.9 Reflections and Orthogonal Maps of the 
Plane 


V stands for an inner product space of dimension n. 


5.9.1 Reflections 


Let $W$ be a subspace of $V$. Assume that $\dim W = n - 1$. We wish to define
reflection with respect to $W$. A little geometric thinking tells us that we
must map each element of $W$ to itself and the vectors perpendicular to $W$
to their negatives. To put this in a precise form, we use the orthogonal
decomposition: $V = W \oplus \mathbb{R}u$ where $u$ is a unit vector orthogonal to $W$. Let
$T: V \to V$ be a linear transformation defined by $T(w) = w$ for all $w \in W$
and $T(w') = -w'$ for all $w' \in \mathbb{R}u = W^\perp$. If $v = w + w'$, where $w \in W$,
$w' \in W^\perp$, then $Tv = w - w'$ (see Figure 5.9.1).

Figure 5.9.1 Reflection with respect to a hyperplane.

We now want to arrive at a neat formula for $T$. Note that $w' = \langle v, u\rangle u$.
Hence

$$Tv = T(w + w') = w - w' = w + w' - 2w' = v - 2w' = v - 2\langle v, u\rangle u.$$


Definition 5.9.1 Let $W$ be an $n - 1$ dimensional vector subspace of an
inner product space $V$. Let $u$ be a unit vector perpendicular to $W$. Define
$T: V \to V$ by $T(v) = v - 2\langle v, u\rangle u$. Then $R_W = T$ is called the reflection
with respect to $W$.


Exercise 5.9.1 The definition of $R_W = T$ does not depend on the choice
of the unit normal. (Recall that since $\dim W = n - 1$, $\dim W^\perp = 1$ and
there are two unit vectors in $W^\perp$.)

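Proposition 5.9.1 Let $V$ be an inner product space of dimension $n$. Let
$W$ be a vector subspace of $V$ with $\dim W = n - 1$. Let $u$ be any unit vector
orthogonal to $W$. Define $T$ by $Tv = v - 2\langle v, u\rangle u$. Then $T$ does not depend
upon the choice of $u$. Further, $T$ is an orthogonal linear transformation on
$V$.

Proof A routine verification. It is best if the reader does it on his own.
Consider

$$\begin{aligned}
T(v_1 + v_2) &= v_1 + v_2 - 2\langle v_1 + v_2, u\rangle u \\
&= v_1 + v_2 - 2\langle v_1, u\rangle u - 2\langle v_2, u\rangle u \\
&= v_1 - 2\langle v_1, u\rangle u + v_2 - 2\langle v_2, u\rangle u \\
&= T(v_1) + T(v_2).
\end{aligned}$$

For $\lambda \in \mathbb{R}$, consider

$$\begin{aligned}
T(\lambda v_1) &= \lambda v_1 - 2\langle \lambda v_1, u\rangle u \\
&= \lambda v_1 - 2\lambda\langle v_1, u\rangle u \\
&= \lambda(v_1 - 2\langle v_1, u\rangle u) \\
&= \lambda T(v_1).
\end{aligned}$$

Therefore $T$ is linear. We next prove that $T$ is orthogonal. Consider

$$\begin{aligned}
\langle Tv, Tv\rangle &= \langle v - 2\langle v, u\rangle u,\ v - 2\langle v, u\rangle u\rangle \\
&= \langle v, v\rangle - \langle v, 2\langle v, u\rangle u\rangle - \langle 2\langle v, u\rangle u, v\rangle + \langle 2\langle v, u\rangle u, 2\langle v, u\rangle u\rangle \\
&= \langle v, v\rangle - 2\langle v, u\rangle^2 - 2\langle v, u\rangle^2 + 4\langle v, u\rangle^2 \\
&= \langle v, v\rangle.
\end{aligned}$$

Hence $\|Tv\|^2 = \|v\|^2$. Hence $\|Tv\| = \|v\|$ and $T$ is orthogonal.
□

The orthogonality of $T$ can also be proved as follows: If $\{w_1, \ldots, w_{n-1}\}$
is an orthonormal basis of $W$, then $\{w_1, \ldots, w_{n-1}, u\}$ is an orthonormal
basis of $V$ and $Tw_i = w_i$, $1 \le i \le n - 1$, and $Tu = -u$ so that $T$ carries an
orthonormal basis to an orthonormal basis.
□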
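The formula $Tv = v - 2\langle v, u\rangle u$ is easy to experiment with. The following is a minimal Python sketch, assuming $V = \mathbb{R}^n$ with the dot product; the function name `reflect` is ours. To avoid taking a square root it accepts any nonzero normal $N$ and uses $u = N/\|N\|$, so that $2\langle v, u\rangle u = 2\langle v, N\rangle N / \|N\|^2$.

```python
def dot(u, v):
    """Standard dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def reflect(v, normal):
    """Reflect v in the hyperplane through 0 orthogonal to `normal`:
    v - 2 <v, u> u with u = normal / ||normal||."""
    nn = dot(normal, normal)
    if nn == 0:
        raise ValueError("the normal must be nonzero")
    coeff = 2 * dot(v, normal) / nn
    return tuple(vi - coeff * ni for vi, ni in zip(v, normal))


v = (1.0, 2.0, 3.0)
print(reflect(v, (0.0, 0.0, 2.0)))     # reflection in the xy-plane
print(reflect(v, (0.0, 0.0, -2.0)))    # the opposite normal gives the same map
print(dot(reflect(v, (1.0, 1.0, 0.0)), reflect(v, (1.0, 1.0, 0.0))), dot(v, v))
```

Replacing the normal by its negative changes the sign of both $\langle v, u\rangle$ and $u$, which is the content of Exercise 5.9.1.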


We now find the matrix associated with the reflection $\rho_x$ with respect to
the subspace $\mathbb{R}e_1$ (that is, the $x$-axis) in $\mathbb{R}^2$. Since

$$\rho_x(e_1) = e_1 \quad \text{and} \quad \rho_x(e_2) = -e_2,$$

the matrix of $\rho_x$ is

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$


Exercise 5.9.2 Show that $\rho_x(x, y) = (x, -y)$.


5.9.2 Orthogonal Maps of the Plane 


We now look at the orthogonal linear maps of $\mathbb{R}^2$. If $T$ is one such, let $A$
denote its matrix with respect to the standard basis. Then $A$ is a $2 \times 2$
orthogonal matrix. So, it suffices to find all orthogonal matrices in $M(2, \mathbb{R})$.

Let

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

be an orthogonal matrix. Therefore $AA^t = I$, equivalently $A^tA = I$, and hence $a^2 + c^2 = 1$ and
$b^2 + d^2 = 1$. Further $\langle (a, c), (b, d)\rangle = 0$. Now, $a^2 + c^2 = 1$ implies that
there exists a unique $\theta \in [0, 2\pi)$ such that $a = \cos\theta$, $c = \sin\theta$. Therefore
$(a, c) = (\cos\theta, \sin\theta)$ and since $\langle (a, c), (b, d)\rangle = 0$, we get that $(b, d) =
\pm(-\sin\theta, \cos\theta)$. Thus we have

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad (5.9.1)$$

or

$$\rho_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}. \qquad (5.9.2)$$

The transformation represented by Equation (5.9.1) is called a rotation by
an angle $\theta$ (see Figure 5.9.2) and that represented by Equation (5.9.2) is
called a reflection (see Figure 5.9.3).

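The latter is called a reflection since it is the reflection with respect to
the line $\mathbb{R}(\cos\frac{\theta}{2}\, e_1 + \sin\frac{\theta}{2}\, e_2) = \{(t\cos\frac{\theta}{2}, t\sin\frac{\theta}{2}) \mid t \in \mathbb{R}\}$. Let us denote
this one-dimensional vector subspace by $W$. A unit normal $u$ to this line is
given by $u = (\sin\frac{\theta}{2}, -\cos\frac{\theta}{2})$ (a vector perpendicular to $(\cos\frac{\theta}{2}, \sin\frac{\theta}{2})$).
Let $R_W$ be the reflection with respect to $W$. Let us compute

$$\begin{aligned}
R_W(e_1) &= e_1 - 2\langle e_1, u\rangle u \\
&= (1, 0) - 2\sin\tfrac{\theta}{2}\,\big(\sin\tfrac{\theta}{2}, -\cos\tfrac{\theta}{2}\big) \\
&= \big(1 - 2\sin^2\tfrac{\theta}{2},\ 2\sin\tfrac{\theta}{2}\cos\tfrac{\theta}{2}\big) \\
&= (\cos\theta, \sin\theta),
\end{aligned}$$

$$\begin{aligned}
R_W(e_2) &= e_2 - 2\langle e_2, u\rangle u \\
&= (0, 1) + 2\cos\tfrac{\theta}{2}\,\big(\sin\tfrac{\theta}{2}, -\cos\tfrac{\theta}{2}\big) \\
&= \big(2\cos\tfrac{\theta}{2}\sin\tfrac{\theta}{2},\ 1 - 2\cos^2\tfrac{\theta}{2}\big) \\
&= (\sin\theta, -\cos\theta).
\end{aligned}$$

Figure 5.9.2 Rotation. Figure 5.9.3 Reflection with respect to a line.

Thus the matrix of $R_W$ with respect to the standard orthonormal basis
$\{e_1, e_2\}$ of $\mathbb{R}^2$ is

$$(R_W(e_1), R_W(e_2)) = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.$$

We could have arrived at this result in a geometric way. To reflect with
respect to the line $W$ is the same as the composition of the following operations: Rotate
$\mathbb{R}^2$ (and hence the line $W$) by an angle $-\theta/2$, reflect with respect to the
$x$-axis and rotate back by an angle $\theta/2$. This composition is the product of
the associated matrices

$$\begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\begin{pmatrix} \cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ -\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}.$$

Hence the result.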
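The geometric description, rotate by $-\theta/2$, reflect in the $x$-axis, rotate back by $\theta/2$, can be checked by multiplying the three matrices. The following is a minimal Python sketch for one arbitrarily chosen angle; the helper names `matmul`, `rotation`, `reflection` and `close` are ours, and equality is tested up to a small floating point tolerance.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def rotation(t):
    """Matrix of the rotation by the angle t, as in Equation (5.9.1)."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


def reflection(t):
    """Matrix of Equation (5.9.2)."""
    return [[math.cos(t), math.sin(t)], [math.sin(t), -math.cos(t)]]


RHO_X = [[1.0, 0.0], [0.0, -1.0]]   # reflection in the x-axis


def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


theta = 1.1  # an arbitrary sample angle
composed = matmul(matmul(rotation(theta / 2), RHO_X), rotation(-theta / 2))
print(close(composed, reflection(theta)))
```

In the product the rightmost factor acts first, which is why `rotation(-theta / 2)` appears last.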




6. Determinants 


Recall that a determinant of order two is defined by the formula

$$\begin{vmatrix} x_1 & y_1 \\ x_2 & y_2 \end{vmatrix} = x_1 y_2 - x_2 y_1.$$

In other words, the above determinant is a number assigned to the matrix

$$\begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix}.$$

In this chapter, we define the determinant of a square matrix. This
definition will be arrived at by treating the above concept in a geometric
way.


6.1 2x 2 Determinant as Area of a 
Parallelogram 


From school geometry, one knows the areas of some elementary figures in 
the plane such as squares, rectangles, right-angled triangles etc. If one 
wants to find the areas of slightly more complicated subsets of R?, one 
appeals to intuitively satisfactory assumptions such as the following: 


(1) The areas of rectangles, right-angled triangles are given by the
well-known formulas.


(2) The area of any figure which consists only of line segments is zero. If
$A$ is the union of two subsets $B$ and $C$ such that $B \cap C$ is built up of
line segments, then the area of $A$ is the sum of the areas of $B$ and $C$.


(3) If the given area is cut into elementary subsets of known area and 
rearranged, then the area of the original figure is the same as that of 
the rearranged figure. 




Using such ideas we can arrive at the area of a parallelogram as in high
school geometry. Consider $\mathbb{R}^2$. Let $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R}^2$ be any
two nonzero vectors. We want to compute the area of the parallelogram
spanned by these vectors. We compute its area in two ways. The first one
will be of importance to us later.

Look at Figure 6.1.1. We can rearrange the given set into a rectangle 
by cutting away a right-angled triangle and pasting it on the opposite side. 
Hence the area of the parallelogram is “the base times the height”. We find 
expressions for these in terms of the data given. 

Let $z$ be the (orthogonal) projection of $y$ on the line perpendicular to $x$, that is,

$$z := y - \frac{\langle y, x\rangle}{\langle x, x\rangle}\, x.$$

Let $h = |z|$. From plane geometry, we have

$$\begin{aligned}
h &= |y| \sin\theta \\
&= |y| \sqrt{1 - \cos^2\theta} \\
&= |y| \sqrt{1 - \frac{\langle x, y\rangle^2}{|x|^2 |y|^2}} \\
&= \frac{|y|}{|x|\,|y|} \sqrt{|x|^2 |y|^2 - \langle x, y\rangle^2} \\
&= \frac{1}{|x|} \sqrt{|x|^2 |y|^2 - \langle x, y\rangle^2}.
\end{aligned}$$

Figure 6.1.1 Determinant as signed area (1).

Now, the area of the parallelogram (see Figure 6.1.1) is

$$A = (\text{base})\, h = |x| \cdot \frac{1}{|x|} \sqrt{|x|^2 |y|^2 - \langle x, y\rangle^2}.$$

Therefore,

$$\begin{aligned}
A^2 &= |x|^2 |y|^2 - \langle x, y\rangle^2 \\
&= (x_1 y_2 - x_2 y_1)^2 \\
&= (\det(x, y))^2,
\end{aligned}$$

where $(x, y)$ stands for the matrix whose first column is the column
vector $x$ and the second column is $y$. Hence the area of the parallelogram
is $|x_1 y_2 - x_2 y_1|$. We may thus think of $\det(x, y)$ as the signed area of the
parallelogram spanned by $x$ and $y$.
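The agreement between "base times height" and the determinant can be confirmed on numbers. The following is a minimal Python sketch, assuming $\mathbb{R}^2$ with the dot product; the names `area_base_height` and `det2` are ours and only illustrative.

```python
import math


def area_base_height(x, y):
    """Area of the parallelogram spanned by x and y as |x| * h, where h = |z|
    and z = y - (<y, x> / <x, x>) x is the part of y perpendicular to x."""
    xx = x[0] * x[0] + x[1] * x[1]          # <x, x>
    xy = x[0] * y[0] + x[1] * y[1]          # <x, y>
    z = (y[0] - xy / xx * x[0], y[1] - xy / xx * x[1])
    return math.sqrt(xx) * math.hypot(z[0], z[1])


def det2(x, y):
    """Determinant of the 2 x 2 matrix with columns x and y."""
    return x[0] * y[1] - x[1] * y[0]


x, y = (3.0, 1.0), (1.0, 2.0)
print(area_base_height(x, y))   # agrees with |det(x, y)| up to rounding
print(det2(x, y), det2(y, x))   # the sign changes with the order of the vectors
```

The area itself does not see the order of $x$ and $y$; only the determinant does, which is why it is called a signed area.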


Figure 6.1.2 Determinant as signed area (2).

The second computation runs as follows: In Figure 6.1.2, the full area is

$$\tfrac{1}{2} y_1 y_2 + x_1 y_2 + \tfrac{1}{2} x_1 x_2,$$

where $x = (x_1, x_2)$, $y = (y_1, y_2)$.
In Figure 6.1.3, the area of the shaded portion is

$$\tfrac{1}{2} x_1 x_2 + x_2 y_1 + \tfrac{1}{2} y_1 y_2.$$

Therefore the area of the parallelogram is $|x_1 y_2 - x_2 y_1| = |\det(x, y)|$.
Thus we may think of the determinant of a $2 \times 2$ matrix

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

as the signed area of the parallelogram spanned by the two column vectors
$\begin{pmatrix} a \\ c \end{pmatrix}$ and $\begin{pmatrix} b \\ d \end{pmatrix}$ in $\mathbb{R}^2$.

Figure 6.1.3 Area of a parallelogram.


Let us look at this view a little more closely. If the vectors are linearly
dependent (in this case one is a multiple of the other), they span only
a one-dimensional figure and hence the area is 0. This is true, since if we
assume that $\begin{pmatrix} a \\ c \end{pmatrix}$ is a scalar multiple of $\begin{pmatrix} b \\ d \end{pmatrix}$, then

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = 0.$$

If the vectors are the standard basis, then they span the unit square whose
area is 1. We immediately verify that this is so, as

$$\det\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1.$$

If we dilate one of the sides, say, by a factor of $\delta$, we expect that the area
of the resulting parallelogram should be $\delta$ times that of the original figure:

$$\det\begin{pmatrix} \delta a & b \\ \delta c & d \end{pmatrix} = \delta \det\begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

However, note that we have

$$\det\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = -1.$$

Even though the parallelogram spanned by $e_2$ and $e_1$ is also the unit
square, we get its area as $-1$! Thus it is clear that we are looking not
only at the final geometric figure, namely, the parallelogram spanned by
the vectors but also at the order in which the vertices are given.


Consider $\mathbb{R}^2$ with the standard basis $\{e_1, e_2\}$. Let $Q$ be the unit square
spanned by $e_1$ and $e_2$, so that the area of $Q$ is 1. Let $\{v, w\}$ be
a basis of $\mathbb{R}^2$. Let $T: \mathbb{R}^2 \to \mathbb{R}^2$ be the linear map such that $Te_1 = v$
and $Te_2 = w$. The matrix representation of $T$ with respect to $\{e_1, e_2\}$ is

$$T = \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \end{pmatrix}.$$

Then the area of the parallelogram $[v, w]$ spanned by $v$ and $w$ is given by
$|\det(T)|$. Thus the linear map $T$ distorts the
area of $Q$ by a factor $|\det(T)|$, that is, $\text{Area}\,[v, w] = |\det(T)| \cdot \text{Area}\,(Q)$
(see Figure 6.1.4). If $v, w$ are linearly dependent, then they span only a line
segment, not a two-dimensional object. Hence in this case the area of $T(Q)$
is zero. Thus $|\det(T)|$ is the factor by which the area of the unit square in $\mathbb{R}^2$
is multiplied to get the area of $T(Q)$. This is the geometric meaning
of the determinant of a square matrix of size 2. We shall return to this
theme later.

Figure 6.1.4 Geometric meaning of the determinant.

In the next section, we shall define the determinant based on these
observations.

6.2 Determinant and its Properties


Let $V$ be any vector space of dimension $n$. Let $V^n$ stand for $V \times V \times \cdots \times V$,
the product of $V$ with itself $n$ times. The reader, if he so wishes, may assume
that $V = \mathbb{R}^n$.
We wish to define determinant as a function which attaches to any $n$-tuple
of vectors $(v_1, \ldots, v_n) \in V^n$ a real number. This number is to be
thought of as the signed volume of the parallelepiped spanned by the $v_i$'s. What
is a parallelepiped in $V$?

Definition 6.2.1 Let $\{v_i\}_{i=1}^n \subset V$. The parallelepiped $[v_1, \ldots, v_n]$ spanned
by $\{v_i\}_{i=1}^n$ is defined as the set $\{v \in V \mid v = \sum_{i=1}^n a_i v_i,\ 0 \le a_i \le 1\}$.
The vectors $v_i$ are called the vertices of the parallelepiped.
Note that all $a_i$ are allowed to vary between 0 and 1. For instance,
$v_1 + \cdots + v_n \in [v_1, \ldots, v_n]$.
When $n = 2$, a parallelepiped is a parallelogram. If $n = 3$, the picture is
Figure 6.2.1.

Figure 6.2.1 Parallelepiped in $\mathbb{R}^3$.

Given $n$ vectors $\{v_1, \ldots, v_n\}$, we form the parallelepiped $[v_1, \ldots, v_n]$.
We want to think of the determinant as a function which assigns "signed"
volume to each parallelepiped $[v_1, \ldots, v_n]$. Thus the determinant should
be a real valued function from

$$\underbrace{V \times \cdots \times V}_{n \text{ times}}.$$

In any kind of measurement we need a unit against which others are
measured. In our case this means that we have to make a choice of a
parallelepiped and declare its volume as 1. Will any $n$ vectors $\{v_1, \ldots, v_n\}$
do? No! For, if they are linearly dependent the parallelepiped lies in a
vector subspace of dimension at most $n - 1$ and hence its $n$-dimensional
volume must be zero. Thus, it behoves us to choose a basis $\{e_i\}_{i=1}^n$ of $V$.
For the remaining part of this discussion, this basis will be fixed.

Based on our geometric intuition, we expect this function $\det: V^n \to \mathbb{R}$
to possess the following geometric properties:


(P1) For all $\alpha \in \mathbb{R}$, and for all $i$,

$$\det(v_1, \ldots, \alpha v_i, \ldots, v_n) = \alpha \det(v_1, \ldots, v_i, \ldots, v_n).$$

(P2) For $i \neq j$, $\det(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = \det(v_1, \ldots, v_i + v_j, \ldots, v_j, \ldots, v_n)$.

(P3) If $\{e_1, \ldots, e_n\}$ is the chosen basis of $V$, then $\det(e_1, \ldots, e_n) = 1$.
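The three properties can be tried out on a familiar candidate. The following is a minimal Python sketch, assuming $V = \mathbb{R}^3$ with the standard basis as the chosen basis and taking for "det" the usual $3 \times 3$ determinant formula applied to column vectors. That this formula is the function sought is an assumption made here for illustration, since at this point the text has not yet constructed det; the name `det3` is ours.

```python
def det3(u, v, w):
    """The usual determinant of the 3 x 3 matrix with columns u, v, w,
    expanded along the first row."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))


u, v, w = (1, 2, 0), (0, 1, 3), (2, 0, 1)
base = det3(u, v, w)

# (P1): scaling one vector by alpha scales the value by alpha.
alpha = 5
scaled = det3(tuple(alpha * c for c in u), v, w)

# (P2): adding another vector to one of the vectors changes nothing.
sheared = det3(tuple(a + b for a, b in zip(u, v)), v, w)

# (P3): the chosen basis has value 1.
unit = det3((1, 0, 0), (0, 1, 0), (0, 0, 1))

print(base, scaled, sheared, unit)
```

Integer vectors are used so that the three identities hold exactly, without rounding.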




Remark 6.2.1 We make some remarks on the geometric contents of the
conditions (P1) through (P3).
Condition (P1) is the mathematical rendition of the principle: The
volume is magnified by $\alpha$ if any one side of the parallelepiped is magnified
by $\alpha$.
(P2) together with (P1) is the mathematical dressing of the principle:
The volume or area is unaltered by cutting and rearranging to get a simpler
geometric figure whose area or volume is easily determined. This is what
we did while computing the area of the parallelogram geometrically (see
(2) in Theorem 6.2.1).
(P3) is a normalization condition which is always needed in any kind of
measurement.

Figure 6.2.2 Magnification of geometric figures.

Assuming the existence of such a function "det", we derive some of its
properties.


Theorem 6.2.1 Assume that there exists a function $\det: V^n \to \mathbb{R}$ with
the properties (P1), (P2) and (P3) listed above. Then

(1) $\det(x_1, \ldots, x_i, \ldots, x_n) = 0$ if $x_i = 0$ for some $i$.

(2) For $i \neq j$ and $\alpha \in \mathbb{R}$,

$$\det(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = \det(v_1, \ldots, v_i + \alpha v_j, \ldots, v_n).$$

(2') More generally, we see that, for any $\alpha_j \in \mathbb{R}$, $j \neq i$,

$$\det(v_1, \ldots, v_i, \ldots, v_n) = \det\Big(v_1, \ldots, v_{i-1},\ v_i + \sum_{j \neq i} \alpha_j v_j,\ v_{i+1}, \ldots, v_n\Big). \qquad (6.2.1)$$

(3) $\det(v_1, \ldots, v_n) = 0$ if $\{v_1, \ldots, v_n\}$ is linearly dependent. In particular,
if $v_i = v_j$ for some $i \neq j$, then $\det(v_1, \ldots, v_n) = 0$.

(4) For any $j \in \{1, \ldots, n\}$ and for any $v_j', v_j''$ we have

$$\det(v_1, \ldots, v_j' + v_j'', \ldots, v_n) = \det(v_1, \ldots, v_j', \ldots, v_n) + \det(v_1, \ldots, v_j'', \ldots, v_n).$$

(5) $\det(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = -\det(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n)$, for
$i \neq j$.
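Proof (1) is easy. For, in (P1) we can take $\alpha = 0$:

$$\det(x_1, \ldots, \alpha x_i, \ldots, x_n) = \alpha \det(x_1, \ldots, x_i, \ldots, x_n).$$

To prove (2), assume without loss of generality that $i < j$ for simplicity.

$$\begin{aligned}
\alpha \det(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n)
&= \det(v_1, \ldots, v_i, \ldots, \alpha v_j, \ldots, v_n) \quad \text{(by (P1))} \\
&= \det(v_1, \ldots, v_i + \alpha v_j, \ldots, \alpha v_j, \ldots, v_n) \quad \text{(by (P2))} \\
&= \alpha \det(v_1, \ldots, v_i + \alpha v_j, \ldots, v_j, \ldots, v_n) \quad \text{(by (P1))}.
\end{aligned}$$

Since this is true for all $\alpha \in \mathbb{R}$, dividing by $\alpha$ when $\alpha \neq 0$ gives (2); the case
$\alpha = 0$ is trivial. (2') follows easily from (2).
We now prove (3). If $\{v_1, \ldots, v_n\}$ is linearly dependent, then there exists
$i$ such that $v_i = \sum_{j \neq i} \alpha_j v_j$. Using (2) repeatedly, we get

$$\begin{aligned}
\det(v_1, \ldots, v_i, \ldots, v_n)
&= \det(v_1, \ldots, v_i - \alpha_j v_j, \ldots, v_n) \quad \text{for } j \neq i \text{ by (2)} \\
&= \det(v_1, \ldots, v_i - \alpha_j v_j - \alpha_k v_k, \ldots, v_n) \quad \text{for } k \neq i, j \text{ by (2)} \\
&= \det\Big(v_1, \ldots, v_i - \sum_{j \neq i} \alpha_j v_j, \ldots, v_n\Big) \\
&= \det(v_1, \ldots, 0, \ldots, v_n) \\
&= 0 \quad \text{by (1)}.
\end{aligned}$$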
We now prove (4). If $v_j$ and $v_j'$ are both linearly dependent on
\[ \{v_1, \dots, v_{j-1}, v_{j+1}, \dots, v_n\}, \]
there is nothing to prove as both sides of the equation are zero by (3). Therefore assume that one of them is linearly independent of
\[ \{v_1, \dots, v_{j-1}, v_{j+1}, \dots, v_n\}. \]

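Assume without loss of generality that $\{v_1, \dots, v_{j-1}, v_j, v_{j+1}, \dots, v_n\}$ is linearly independent. Hence it is a basis of $V$. So we can write $v_j' = a_j v_j + Y$ where
\[ Y \in L(\{v_1, \dots, v_{j-1}, v_{j+1}, \dots, v_n\}) \]
is in the linear span of $\{v_i \mid 1 \le i \le n,\ i \neq j\}$. Therefore,
\begin{align*}
\det(v_1, \dots, v_j + v_j', \dots, v_n)
&= \det(v_1, \dots, v_j + a_j v_j + Y, \dots, v_n) \\
&= \det(v_1, \dots, v_j + a_j v_j, \dots, v_n) && \text{by (2')} \\
&= \det(v_1, \dots, (1 + a_j) v_j, \dots, v_n) \\
&= (1 + a_j) \det(v_1, \dots, v_j, \dots, v_n) && \text{by (P1)} \\
&= \det(v_1, \dots, v_j, \dots, v_n) + a_j \det(v_1, \dots, v_j, \dots, v_n) \\
&= \det(v_1, \dots, v_j, \dots, v_n) + \det(v_1, \dots, a_j v_j, \dots, v_n) && \text{by (P1)} \\
&= \det(v_1, \dots, v_j, \dots, v_n) + \det(v_1, \dots, a_j v_j + Y, \dots, v_n) && \text{by (2')} \\
&= \det(v_1, \dots, v_j, \dots, v_n) + \det(v_1, \dots, v_j', \dots, v_n).
\end{align*}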


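We prove (5). We consider det on $(v_1, \dots, v_i + v_j, \dots, v_i + v_j, \dots, v_n)$ and use (4):
\begin{align*}
0 &= \det(v_1, \dots, v_i + v_j, \dots, v_i + v_j, \dots, v_n) \\
&= \det(v_1, \dots, v_i + v_j, \dots, v_j, \dots, v_n) + \det(v_1, \dots, v_i + v_j, \dots, v_i, \dots, v_n) \\
&= \det(v_1, \dots, v_i, \dots, v_j, \dots, v_n) + \det(v_1, \dots, v_j, \dots, v_j, \dots, v_n) \\
&\quad + \det(v_1, \dots, v_i, \dots, v_i, \dots, v_n) + \det(v_1, \dots, v_j, \dots, v_i, \dots, v_n) \\
&= \det(v_1, \dots, v_i, \dots, v_j, \dots, v_n) + \det(v_1, \dots, v_j, \dots, v_i, \dots, v_n),
\end{align*}
as $\det(v_1, \dots, v_i, \dots, v_i, \dots, v_n) = 0 = \det(v_1, \dots, v_j, \dots, v_j, \dots, v_n)$ by (3).
□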
Properties (P1) and (4) put together imply that the function det is "linear in each variable". This means that if we keep all variables $v_j$ except the $i$th variable fixed, then it is a linear map in $v_i$. To be precise we have the following


Definition 6.2.2 Let $V$ be any vector space. An $r$-linear map is a function $f\colon V^r \to \mathbb{R}$ such that for each $i$, $1 \le i \le r$, the following are true:
\begin{align*}
f(v_1, \dots, v_i + w_i, \dots, v_r) &= f(v_1, \dots, v_i, \dots, v_r) + f(v_1, \dots, w_i, \dots, v_r), \\
f(v_1, \dots, a v_i, \dots, v_r) &= a f(v_1, \dots, v_i, \dots, v_r),
\end{align*}
for all $v_i, w_i \in V$ and $a \in \mathbb{R}$.
If $f$ is 2-linear (respectively 3-linear), then we say that $f$ is bilinear (respectively trilinear).




Example 6.2.1 Let $V$ be an inner product space. Then the map
\[ f(x, y) = \langle x, y \rangle \]
is a bilinear map.

Example 6.2.2 Let $V := \mathbb{R}^2$ and $f(x, y) = x_1 y_2 - x_2 y_1$. Then $f$ is bilinear.
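The bilinearity claimed in Example 6.2.2 can be spot-checked numerically. The following is only a sketch: the helper names (`f`, `add`, `scale`, `check_bilinear`) are illustrative choices, and a check on random integer vectors is of course not a proof.

```python
import random

def f(x, y):
    """The map of Example 6.2.2 on R^2: f(x, y) = x1*y2 - x2*y1."""
    return x[0] * y[1] - x[1] * y[0]

def add(u, v):
    """Componentwise sum of two vectors in R^2."""
    return (u[0] + v[0], u[1] + v[1])

def scale(a, u):
    """Scalar multiple a*u of a vector in R^2."""
    return (a * u[0], a * u[1])

def check_bilinear(trials=100, seed=0):
    """Test the two conditions of Definition 6.2.2 in each slot on random integer data."""
    rng = random.Random(seed)
    vec = lambda: (rng.randint(-9, 9), rng.randint(-9, 9))
    for _ in range(trials):
        x, x2, y, y2 = vec(), vec(), vec(), vec()
        a = rng.randint(-9, 9)
        # linearity in the first variable
        assert f(add(x, x2), y) == f(x, y) + f(x2, y)
        assert f(scale(a, x), y) == a * f(x, y)
        # linearity in the second variable
        assert f(x, add(y, y2)) == f(x, y) + f(x, y2)
        assert f(x, scale(a, y)) == a * f(x, y)
        # interchanging the two arguments changes the sign
        assert f(y, x) == -f(x, y)
    return True
```

Integer inputs are used so that every comparison is exact.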


Proposition 6.2.2 If $\det\colon V^n \to \mathbb{R}$ exists satisfying the conditions P1 through P3, then det is $n$-linear.

Proof If such a det exists, it enjoys the properties (1) through (5) listed in Theorem 6.2.1. The Proposition then follows from (P1) and (4).


A more general formulation of (5) is true and it involves certain facts about the group of permutations on $n$ symbols. Let $S_n$ denote the set of permutations (bijections) of the set $\{1, \dots, n\}$. In order not to interrupt our discussion, we shall assume the following facts about $S_n$ as known to the reader. Proofs may be found in any book on algebra or a book on group theory.

(1) The number of elements in $S_n$ is $n!$.

(2) If $\sigma$ and $\tau$ are in $S_n$, then their composition $\sigma \circ \tau$ also lies in $S_n$.

(3) If $\sigma \in S_n$, so does its inverse $\sigma^{-1}$.

(4) A standard way of writing $\sigma \in S_n$ is
\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}. \]

(5) A permutation which switches exactly two elements and leaves the rest unaffected is called a transposition. If $\sigma$ is defined by $\sigma(k) = k$ if $k \neq i$ and $k \neq j$, $\sigma(i) = j$ and $\sigma(j) = i$, then in the above notation it is written as
\[ \begin{pmatrix} 1 & \cdots & i & \cdots & j & \cdots & n \\ 1 & \cdots & j & \cdots & i & \cdots & n \end{pmatrix}. \]
However, in this case, we use a still shorter notation $\sigma := (ij)$.

(6) It is known that any permutation is a product (composition) of transpositions. While this product is not unique, the parity of the number of transpositions in any such product is well-defined: either it is always even or always odd.




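(7) If we set $\operatorname{sign}(\sigma) := (-1)^r$ when $\sigma$ is a product of $r$ transpositions, then $\operatorname{sign}(\sigma)$ is well-defined. $\operatorname{sign}(\sigma)$ is called the sign of $\sigma$. In particular, the sign of a transposition is $-1$.

(8) If $\sigma, \tau \in S_n$, then $\operatorname{sign}(\sigma \circ \tau) = \operatorname{sign}(\sigma) \operatorname{sign}(\tau)$.

(9) Let $\tau \in S_n$ be fixed. Then the map $\sigma \mapsto \tau \circ \sigma$ is a bijection of $S_n$.

(10) The map $\sigma \mapsto \sigma^{-1}$ is a bijection of $S_n$.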


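An important consequence of (9) and (10) is the following observation which will be used many times in the sequel: Let $f\colon S_n \to \mathbb{R}$ be a function. Then
\[ \sum_{\sigma \in S_n} f(\sigma) = \sum_{\sigma \in S_n} f(\sigma^{-1}) = \sum_{\sigma \in S_n} f(\sigma \circ \tau), \quad \text{for any fixed } \tau \in S_n. \]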
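The facts about $S_n$ assumed above can be checked by brute force for small $n$. The sketch below represents a permutation as a tuple of 0-based images, and computes the sign by counting inversions, which is a standard equivalent of the parity of the number of transpositions. The function names and the test function `f` are illustrative assumptions, not notation from the text.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation p of (0, ..., n-1), via the parity of its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def compose(s, t):
    """The composition s o t, i.e. k -> s(t(k))."""
    return tuple(s[t[k]] for k in range(len(t)))

def inverse(s):
    """The inverse permutation of s."""
    inv = [0] * len(s)
    for k, image in enumerate(s):
        inv[image] = k
    return tuple(inv)

def check_facts(n=4):
    """Check multiplicativity of sign and the re-indexing of sums over S_n."""
    perms = list(permutations(range(n)))
    # an arbitrary real-valued function on S_n, used only for testing
    f = lambda p: sum((k + 1) * p[k] ** 2 for k in range(n))
    total = sum(f(s) for s in perms)
    for t in perms:
        for s in perms:
            assert sign(compose(s, t)) == sign(s) * sign(t)
        # summing f(s o t) over all s gives the same total
        assert sum(f(compose(s, t)) for s in perms) == total
    # summing f(s^{-1}) over all s gives the same total
    assert sum(f(inverse(s)) for s in perms) == total
    return True
```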
Definition 6.2.3 Let $f\colon V^r \to \mathbb{R}$ be an $r$-linear map. $f$ is said to be skew-symmetric if $f(v_{\sigma(1)}, \dots, v_{\sigma(r)}) = \operatorname{sign}(\sigma) f(v_1, \dots, v_r)$ for all $\sigma \in S_r$.

Exercise 6.2.1 The function $f$ in Example 6.2.2 is skew-symmetric and bilinear.


Proposition 6.2.3 If $\det\colon V^n \to \mathbb{R}$ exists satisfying the conditions P1 through P3, then det is skew-symmetric. That is, for any permutation $\sigma \in S_n$, we have $\det(v_1, \dots, v_n) = \operatorname{sign}(\sigma) \det(v_{\sigma(1)}, \dots, v_{\sigma(n)})$.

Proof This follows from property (5) in Theorem 6.2.1 and the facts (6) and (7) about $S_n$.
□

We have thus proved that if $f\colon V^n \to \mathbb{R}$ with the properties P1, P2 exists, then such an $f$ is $n$-linear and skew-symmetric. (We have not used P3 so far.) The next result asserts the existence of such maps.


Theorem 6.2.4 Fix a basis $\{e_1, \dots, e_n\}$ of $V$. Then there exists a unique function
\[ g\colon \underbrace{V \times \cdots \times V}_{n \text{ times}} \to \mathbb{R} \]
such that
(1) $g$ is $n$-linear.
(2) $g$ is skew-symmetric.
(3) $g(e_1, \dots, e_n) = 1$.


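Proof Let $v_1, \dots, v_n \in V$. Write $v_i = \sum_j a_{ij} e_j$. Then
\begin{align*}
g(v_1, \dots, v_n)
&= g\Bigl(\sum_{j_1} a_{1j_1} e_{j_1}, \dots, \sum_{j_n} a_{nj_n} e_{j_n}\Bigr) \\
&= \sum_{j_1} a_{1j_1}\, g\Bigl(e_{j_1}, \sum_{j_2} a_{2j_2} e_{j_2}, \dots, \sum_{j_n} a_{nj_n} e_{j_n}\Bigr) && \text{by $n$-linearity} \\
&= \sum_{j_1, \dots, j_n} a_{1j_1} a_{2j_2} \cdots a_{nj_n}\, g(e_{j_1}, e_{j_2}, \dots, e_{j_n}) && \text{by $n$-linearity} \\
&= \sum_{(j_1, \dots, j_n)} a_{1j_1} a_{2j_2} \cdots a_{nj_n} \operatorname{sign}\begin{pmatrix} 1 & \cdots & n \\ j_1 & \cdots & j_n \end{pmatrix} g(e_1, \dots, e_n) \\
&= \sum_{(j_1, \dots, j_n)} \operatorname{sign}\begin{pmatrix} 1 & \cdots & n \\ j_1 & \cdots & j_n \end{pmatrix} a_{1j_1} a_{2j_2} \cdots a_{nj_n}\, g(e_1, \dots, e_n).
\end{align*}
In the last two lines the sum runs only over those $(j_1, \dots, j_n)$ which are permutations of $(1, \dots, n)$: a term with a repeated index vanishes, since by skew-symmetry interchanging the two equal arguments changes the sign of $g$ without changing its value.

This equation tells us that if a function $g$ satisfies the properties (1) and (2) of the theorem, then it must be of the form
\[ g(v_1, \dots, v_n) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)}\, g(e_1, \dots, e_n). \tag{6.2.2} \]
We now show that if $g$ is defined by setting
\[ g(v_1, \dots, v_n) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)} \tag{6.2.3} \]
where $v_i = \sum_{j=1}^n a_{ij} e_j$, then $g$ has the properties (1)-(3).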
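Formula (6.2.3) can be evaluated directly for small $n$. The sketch below assumes the vectors are given by their coordinate rows, so that `rows[i][j]` plays the role of $a_{ij}$; the names `sign` and `leibniz_det` are illustrative. The sum has $n!$ terms, so this is meant for illustration rather than for efficient computation.

```python
from itertools import permutations
from math import prod

def sign(p):
    """Sign of a permutation p of (0, ..., n-1), via the parity of its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def leibniz_det(rows):
    """Evaluate formula (6.2.3): the sum over all permutations s of
    sign(s) * a[0][s(0)] * ... * a[n-1][s(n-1)], where rows[i][j] = a_ij."""
    n = len(rows)
    return sum(
        sign(p) * prod(rows[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )
```

For instance, `leibniz_det([[1, 0], [0, 1]])` evaluates the formula on the basis vectors themselves, which is property (3) of the theorem in the case $n = 2$.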

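We show that $g(v_1 + v_1', v_2, \dots, v_n) = g(v_1, \dots, v_n) + g(v_1', v_2, \dots, v_n)$. Let $w_1 = v_1 + v_1'$ and $w_r = v_r$ for $r \ge 2$. If we write $w_i = \sum_j \beta_{ij} e_j$, then $\beta_{1j} = a_{1j} + a_{1j}'$, where $v_1' = \sum_j a_{1j}' e_j$. We have
\begin{align*}
g(w_1, \dots, w_n)
&= \sum_{\sigma} \operatorname{sign}(\sigma)\, \beta_{1\sigma(1)} \cdots \beta_{n\sigma(n)} \\
&= \sum_{\sigma} \operatorname{sign}(\sigma)\, (a_{1\sigma(1)} + a_{1\sigma(1)}')\, a_{2\sigma(2)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)} + \sum_{\sigma} \operatorname{sign}(\sigma)\, a_{1\sigma(1)}' a_{2\sigma(2)} \cdots a_{n\sigma(n)} \\
&= g(v_1, \dots, v_n) + g(v_1', v_2, \dots, v_n).
\end{align*}
A similar computation shows that $g(a v_1, v_2, \dots, v_n) = a\, g(v_1, \dots, v_n)$ for any $a \in \mathbb{R}$. Thus $g$ is linear in the first variable. One can either prove its linearity in the other variables in a similar way, or derive it from the fact that $g$ satisfies (2) of the theorem.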




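To prove (2), let $\tau = (ij)$ be a transposition with $i < j$. Let $w_r := v_{\tau(r)}$. We must show that $g(w_1, \dots, w_n) = -g(v_1, \dots, v_n)$. Let us write
\[ w_k = \sum_r \beta_{kr} e_r. \]
Then we have
\[ \beta_{kr} = \begin{cases} a_{kr} & \text{if } k \neq i \text{ and } k \neq j, \\ a_{jr} & \text{if } k = i, \\ a_{ir} & \text{if } k = j. \end{cases} \]
We use this below:
\begin{align*}
g(w_1, \dots, w_n)
&= \sum_{\sigma} \operatorname{sign}(\sigma)\, \beta_{1\sigma(1)} \cdots \beta_{n\sigma(n)} \\
&= \sum_{\sigma} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{j\sigma(i)} \cdots a_{i\sigma(j)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{i\sigma(j)} \cdots a_{j\sigma(i)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma} \operatorname{sign}(\sigma)\, a_{1\sigma\tau(1)} \cdots a_{i\sigma\tau(i)} \cdots a_{j\sigma\tau(j)} \cdots a_{n\sigma\tau(n)} \\
&= \sum_{\sigma\tau} \operatorname{sign}(\sigma\tau) \operatorname{sign}(\tau)\, a_{1\sigma\tau(1)} \cdots a_{i\sigma\tau(i)} \cdots a_{j\sigma\tau(j)} \cdots a_{n\sigma\tau(n)} \\
&= \operatorname{sign}(\tau) \sum_{\eta \in S_n} \operatorname{sign}(\eta)\, a_{1\eta(1)} \cdots a_{n\eta(n)} \\
&= -g(v_1, \dots, v_n).
\end{align*}
Since every permutation is a product of transpositions, the general case follows from this. Thus $g$ satisfies (2). (3) is easy and left to the reader.

We have thus proved that there exists a function $g$ as required and it is given by Equation (6.2.3).
□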
Definition 6.2.4 We call the $g$ of Theorem 6.2.4 the determinant and denote it by det.


For the purpose of easy reference we list the following properties of the function det in the form of a theorem:

Theorem 6.2.5 Let $V$ be a vector space. Fix a basis $\{e_1, \dots, e_n\}$ of $V$. Define $\det\colon V^n \to \mathbb{R}$ by
\[ \det(v_1, \dots, v_n) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)} \]
for $v_i = \sum_{j=1}^n a_{ij} e_j$. Then $\det\colon V^n \to \mathbb{R}$ satisfies the following properties:




(1) det is linear in each of its variables.

(2) If one interchanges $v_i$ and $v_j$ for $i \neq j$, then the determinants are of opposite sign. More generally,
\[ \det(v_{\sigma(1)}, \dots, v_{\sigma(n)}) = \operatorname{sign}(\sigma) \det(v_1, \dots, v_n) \quad \text{for any } \sigma \in S_n. \]

(3) $\det(e_1, \dots, e_n) = 1$.

(4) $\det(v_1, \dots, v_n) = 0$ if the $v_i$ are linearly dependent.

Proof This is left as an instructive exercise to the reader.
□

Exercise 6.2.2 Show that any map $f\colon V^n \to \mathbb{R}$ which is $n$-linear and skew-symmetric is of the form $f = f(e_1, \dots, e_n) \det$. Hint: This is already solved in the proof of Theorem 6.2.4.


Definition 6.2.5 We define the determinant function on $M(n, \mathbb{R})$ as follows: Let $A \in M(n, \mathbb{R})$ and write $A = (C_1, \dots, C_n)$ where $C_i$ is the $i$th column of $A$. We then define
\[ \det A := \det(C_1, \dots, C_n) \]
where det is the $n$-linear skew-symmetric function $\underbrace{\mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{n \text{ times}} \to \mathbb{R}$ with $\det(e_1, \dots, e_n) = 1$. (Here $\{e_1, \dots, e_n\}$ is the standard basis of $\mathbb{R}^n$.) Note that $\det A = \det(Ae_1, \dots, Ae_n)$ for $A \in M(n, \mathbb{R})$.

If $V$ is assumed to be an inner product space, then there are natural choices of the basis, namely, we would like to choose an orthonormal basis $\{e_i\}$ so that $[e_1, \dots, e_n]$ is "the unit cube". In this case it is natural to demand that its volume be 1, which is nothing but P3!

In the next section we show how to compute the determinant using only the properties of det listed in Theorem 6.2.5.


6.3 Computation of Determinants 


In this section, we illustrate how to compute the determinants of matrices. We consider vectors of $\mathbb{R}^n$ as column vectors. We write the given matrix $A$ as $A = (C_1, \dots, C_n)$ where $C_i$ is the $i$th column of $A$. Then $\det(A)$ is defined by the formula
\[ \det(A) = \det(C_1, \dots, C_n) \quad \text{where } C_i \in \mathbb{R}^n. \]

In the first of the computations, we explain which property of the determinant function (in Theorem 6.2.5) is used.




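Example 6.3.1 Let $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ be a $2 \times 2$ matrix. Then
\begin{align*}
\det\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
&= \det(a_{11} e_1 + a_{21} e_2,\ a_{12} e_1 + a_{22} e_2) && \text{(by definition)} \\
&= \det(a_{11} e_1,\ a_{12} e_1 + a_{22} e_2) + \det(a_{21} e_2,\ a_{12} e_1 + a_{22} e_2) && \text{by (1)} \\
&= \det(a_{11} e_1, a_{12} e_1) + \det(a_{11} e_1, a_{22} e_2) && \text{by (1)} \\
&\quad + \det(a_{21} e_2, a_{12} e_1) + \det(a_{21} e_2, a_{22} e_2) && \text{by (1)} \\
&= \det(a_{11} e_1, a_{22} e_2) + \det(a_{21} e_2, a_{12} e_1) && \text{by (2)} \\
&= a_{11} a_{22} \det(e_1, e_2) + a_{21} a_{12} \det(e_2, e_1) && \text{by (1)} \\
&= a_{11} a_{22} - a_{21} a_{12} && \text{by (2) and (3)}.
\end{align*}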


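Example 6.3.2 We compute the determinant of a diagonal matrix:
\begin{align*}
\det\begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}
&= \det(a_{11} e_1, a_{22} e_2, \dots, a_{nn} e_n) \\
&= a_{11} a_{22} \cdots a_{nn} \det(e_1, \dots, e_n) \\
&= a_{11} a_{22} \cdots a_{nn}.
\end{align*}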


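Example 6.3.3 We now find the determinant of a triangular matrix:
\begin{align*}
\det\begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}
&= \det(a_{11} e_1 + \cdots + a_{n1} e_n,\ a_{22} e_2 + \cdots + a_{n2} e_n,\ \dots, \\
&\qquad\quad a_{n-1\,n-1} e_{n-1} + a_{n\,n-1} e_n,\ a_{nn} e_n) \\
&= \det(v_1, \dots, v_n),
\end{align*}
in an obvious notation.

If we expand the above using multilinearity, say, in the first variable then the only term which contributes is $a_{11} e_1$:
\[ \det(v_1, v_2, \dots, v_n) = \det(a_{11} e_1, v_2, \dots, v_n). \]
Indeed, for $k \ge 2$ the vectors $e_k, v_2, \dots, v_n$ all lie in the span of $e_2, \dots, e_n$, so they are linearly dependent and the corresponding term is zero. Proceeding this way, we see that
\begin{align*}
\det(a_{11} e_1 + \cdots + a_{n1} e_n,\ a_{22} e_2 + \cdots + a_{n2} e_n,\ \dots,\ a_{nn} e_n)
&= \det(a_{11} e_1, \dots, a_{nn} e_n) \\
&= a_{11} a_{22} \cdots a_{nn} \det(e_1, \dots, e_n) \\
&= a_{11} \cdots a_{nn}.
\end{align*}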




Let us do a couple of numerical examples.


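Example 6.3.4
\begin{align*}
\det\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}
&= \det(e_1, e_4, e_2, e_3) \\
&= -\det(e_1, e_2, e_4, e_3) \\
&= \det(e_1, e_2, e_3, e_4) \\
&= 1.
\end{align*}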


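Example 6.3.5
\begin{align*}
\det\begin{pmatrix} 1 & 5 & 0 & 0 \\ 2 & 0 & 8 & 0 \\ 3 & 6 & 9 & 0 \\ 4 & 7 & 10 & 1 \end{pmatrix}
&= \det(e_1 + 2e_2 + 3e_3 + 4e_4,\ 5e_1 + 6e_3 + 7e_4,\ 8e_2 + 9e_3 + 10e_4,\ e_4) \\
&= \det(3e_3, 5e_1, 8e_2, e_4) + \det(2e_2, 5e_1, 9e_3, e_4) + \det(e_1, 6e_3, 8e_2, e_4) \\
&= 8 \times 3 \times 5 \times 1 \times \det(e_1, e_2, e_3, e_4) \\
&\quad - 2 \times 5 \times 9 \times 1 \times \det(e_1, e_2, e_3, e_4) \\
&\quad - 1 \times 6 \times 8 \times 1 \times \det(e_1, e_2, e_3, e_4) \\
&= 120 - 90 - 48 = -18.
\end{align*}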
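The two numerical examples can be cross-checked against the permutation formula of Theorem 6.2.5. In this sketch each matrix is entered as its list of columns, following the convention $\det A = \det(C_1, \dots, C_n)$; the helper names are illustrative assumptions.

```python
from itertools import permutations
from math import prod

def sign(p):
    """Sign of a permutation p of (0, ..., n-1), via the parity of its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det_of_columns(cols):
    """det(C_1, ..., C_n) by the permutation formula; cols[i][j] is the
    j-th coordinate of the i-th column vector."""
    n = len(cols)
    return sum(
        sign(p) * prod(cols[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )

# Example 6.3.4: the columns are e1, e4, e2, e3.
example_634 = [(1, 0, 0, 0), (0, 0, 0, 1), (0, 1, 0, 0), (0, 0, 1, 0)]

# Example 6.3.5: the columns are e1+2e2+3e3+4e4, 5e1+6e3+7e4, 8e2+9e3+10e4, e4.
example_635 = [(1, 2, 3, 4), (5, 0, 6, 7), (0, 8, 9, 10), (0, 0, 0, 1)]
```

Evaluating `det_of_columns` on the two lists should reproduce the values 1 and $-18$ obtained by hand.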


b 
Exercise 6.3.1 Compute the determinant of v 
y 


We look at slightly more sophisticated examples. 


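Example 6.3.6 We shall give the beginning of the computation only, leaving the rest to the reader. It may be a good idea to look at the special cases $n = 2, 3$ and gain insight into what is happening.
\begin{align*}
\det\begin{pmatrix} 1 + a_1 & a_2 & \cdots & a_n \\ a_1 & 1 + a_2 & \cdots & a_n \\ \vdots & & \ddots & \vdots \\ a_1 & a_2 & \cdots & 1 + a_n \end{pmatrix}
&= \det\Bigl(e_1 + \sum_{j_1} a_1 e_{j_1},\ e_2 + \sum_{j_2} a_2 e_{j_2},\ \dots,\ e_n + \sum_{j_n} a_n e_{j_n}\Bigr) \\
&= \cdots = 1 + a_1 + \cdots + a_n.
\end{align*}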




Example 6.3.7 The matrix of this example is from differential geometry and its computation is required while computing the volume element of a (hyper) surface given as a graph. The matrix is
\[ \begin{pmatrix} 1 + x_1^2 & x_1 x_2 & \cdots & x_1 x_n \\ x_2 x_1 & 1 + x_2^2 & \cdots & x_2 x_n \\ \vdots & & \ddots & \vdots \\ x_n x_1 & x_n x_2 & \cdots & 1 + x_n^2 \end{pmatrix}. \]
The trick here is, as in the last case, to recognize the $i$th column vector $C_i$ as the vector $e_i + x_i x$, where $x = (x_1, \dots, x_n)^t$.


6.4 Basic Results on Determinants


In this section we establish all the standard results one needs about determinants.

Recall our definition of $\det(A)$ for $A \in M(n, \mathbb{R})$. Let $A = (A_1, \dots, A_n)$ where the $A_i$ are the columns of $A$ and hence we may consider
\[ A := (A_1, \dots, A_n) \in \underbrace{\mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{n \text{ times}}. \]
It is also worth noting that $A_i = Ae_i$ where the right side is the matrix multiplication of $A$ with the column vector $e_i$. Thus $A = (Ae_1, \dots, Ae_n)$. These facts will be used below without explicit mention.


Theorem 6.4.1 $\det(AB) = (\det A)(\det B)$.

Proof The most elegant proof runs as follows: Consider
\[ f\colon \underbrace{\mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{n \text{ times}} \to \mathbb{R} \]
defined by
\[ f(B) = \det(AB). \tag{6.4.1} \]
Then one shows that $f$ satisfies the properties (1) and (2) of Theorem 6.2.4. By Exercise 6.2.2 it follows that $f$ is given by $f(B) = f(I) \det B$. But $f(I) = \det(AI) = \det A$. Therefore,
\[ f(B) = \det A \det B. \tag{6.4.2} \]




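From Equations (6.4.1) and (6.4.2), we get $\det(AB) = (\det A)(\det B)$.

We now give a slightly different proof. Consider
\[ \varphi(x_1, \dots, x_n) := \det A \det(x_1, \dots, x_n) - \det(Ax_1, \dots, Ax_n), \]
where $x_i \in \mathbb{R}^n$. Then

(1) $\varphi$ is linear in each $x_i$. This is easily seen since det is linear in each of its variables and $A$ is linear. For instance,
\begin{align*}
\varphi(x_1 + x_1', x_2, \dots, x_n)
&= \det A \det(x_1 + x_1', \dots, x_n) - \det(A(x_1 + x_1'), \dots, Ax_n) \\
&= \bigl(\det A \det(x_1, \dots, x_n) + \det A \det(x_1', \dots, x_n)\bigr) \\
&\quad - \bigl(\det(Ax_1, \dots, Ax_n) + \det(Ax_1', \dots, Ax_n)\bigr) \\
&= \det A \det(x_1, \dots, x_n) - \det(Ax_1, \dots, Ax_n) \\
&\quad + \det A \det(x_1', \dots, x_n) - \det(Ax_1', \dots, Ax_n) \\
&= \varphi(x_1, x_2, \dots, x_n) + \varphi(x_1', x_2, \dots, x_n).
\end{align*}

(2) $\varphi(e_{i_1}, \dots, e_{i_n}) = 0$, where $e_{i_1}, \dots, e_{i_n}$ are any $n$ vectors from the standard basis vectors. For, if $e_{i_j} = e_{i_k}$ for some $j \neq k$, then $Ae_{i_j} = Ae_{i_k}$ as well, and both terms in the definition of $\varphi$ are zero. Otherwise, $(e_{i_1}, \dots, e_{i_n})$ is a rearrangement of the standard basis by some permutation $\sigma$, and by skew-symmetry both terms are equal to $\operatorname{sign}(\sigma) \det A$.

Hence, writing $B_j = \sum_i b_{ij} e_i$ for the $j$th column of $B$ and expanding by (1), we get
\[ \varphi(B_1, \dots, B_n) = \sum_{i_1, \dots, i_n} b_{i_1 1} \cdots b_{i_n n}\, \varphi(e_{i_1}, \dots, e_{i_n}) = 0. \]
That is,
\begin{align*}
0 = \varphi(B_1, \dots, B_n) &= \det A \det(B_1, \dots, B_n) - \det(AB_1, \dots, AB_n) \\
&= \det A \det B - \det(AB),
\end{align*}
since $\det(AB) = \det(AB(e_1), \dots, AB(e_n)) = \det(AB_1, \dots, AB_n)$.
□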
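Theorem 6.4.1 is easy to test on random integer matrices. The following sketch uses the permutation formula for the determinant; the helper names (`det`, `matmul`, `check_product_rule`) and the choice of $3 \times 3$ test matrices are illustrative assumptions, and a finite check is no substitute for either proof.

```python
from itertools import permutations
from math import prod
import random

def sign(p):
    """Sign of a permutation p of (0, ..., n-1), via the parity of its inversion count."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det(a):
    """Determinant of a square matrix (list of rows) by the permutation formula."""
    n = len(a)
    return sum(sign(p) * prod(a[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def matmul(a, b):
    """Product of two n x n matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def check_product_rule(n=3, trials=50, seed=1):
    """Compare det(AB) with det(A) * det(B) on random integer matrices."""
    rng = random.Random(seed)
    rand = lambda: [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
    for _ in range(trials):
        a, b = rand(), rand()
        assert det(matmul(a, b)) == det(a) * det(b)
    return True
```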


Corollary 6.4.2 Let $A, X \in M(n, \mathbb{R})$. Let $A$ be invertible. Then we have
(1) $\det(A^{-1}) = \det(A)^{-1}$.
(2) $\det(AXA^{-1}) = \det(X)$.

Proof Since $AA^{-1} = I$, the identity, (1) follows from Theorem 6.4.1 since $\det(I) = 1$. (2) follows from this and Theorem 6.4.1.
□
Theorem 6.4.3 Let $A \in M(n, \mathbb{R})$. Then
(1) $A$ is invertible if and only if the columns $A_i$ of $A$ are linearly independent.
(2) $\det A = 0$ if and only if the columns of $A$ are linearly dependent.
(3) $\det A \neq 0$ if and only if $A$ is invertible.
Proof Let the columns $A_i$ of $A$ be linearly independent. Then
\[ \{A_i = Ae_i \mid 1 \le i \le n\} \]
is a basis of $\mathbb{R}^n$ where $\{e_i\}_{i=1}^n$ is the standard basis of $\mathbb{R}^n$. Therefore there exist scalars $\beta_{ji}$ such that $e_i = \sum_j \beta_{ji} A_j$. We claim that $B = (\beta_{ji})$ is the inverse of $A$:
\[ e_i = \sum_j \beta_{ji} A_j = \sum_j \beta_{ji} A e_j = \sum_j \beta_{ji} \Bigl(\sum_k a_{kj} e_k\Bigr) = \sum_k \Bigl[\sum_j a_{kj} \beta_{ji}\Bigr] e_k. \]
Since $\{e_k\}$ is a basis, by uniqueness, we see that $\sum_j a_{kj} \beta_{ji} = \delta_{ki}$ and hence the claim. Therefore $A$ is invertible with $B$ as its inverse.

Conversely, if $A$ is invertible, then $A$ maps a basis of $\mathbb{R}^n$ to another basis. But $A_i = Ae_i$ and hence $\{A_i\}$ is a basis of $\mathbb{R}^n$. This proves (1).

From the properties of the determinant we know that if the columns of $A$ are linearly dependent, $\det A = 0$. Conversely, assume that $\det A = 0$ and suppose the columns of $A$ are linearly independent. From (1) we know that $A$ has an inverse $B$. Therefore $1 = \det I = \det(AB) = \det A \det B$. It follows that $\det A \neq 0$, a contradiction. Hence the columns of $A$ are linearly dependent. This proves (2).

(3) is an immediate consequence of the first two assertions: $\det A \neq 0$ if and only if the columns $A_i$ are linearly independent (by the second assertion) which is true if and only if $A$ is invertible.
□


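Theorem 6.4.4 Let $A \in M(n, \mathbb{R})$. Then $Ax = 0$ has a nonzero solution $x \in \mathbb{R}^n$ if and only if $\det A = 0$.

Proof Let $x = (x_1, \dots, x_n)^t$. Then
\[ (A_1\, A_2 \cdots A_n) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} \]
if and only if $\sum_i x_i A_i = 0$. Such a relation holds with some $x_i \neq 0$ if and only if the $A_i$'s are linearly dependent, which is if and only if $\det A = 0$, by Theorem 6.4.3.
□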
Definition 6.4.1 Let $T\colon V \to V$ be a linear map. We fix a basis $\{v_i\}$ of $V$. We define $\det T := \det M^{v}(T)$, where $M^{v}(T)$ is the matrix of $T$ with respect to this basis. Is this well-defined? That is, if we choose another basis $\{u_1, \dots, u_n\}$ of $V$ and set $\det T := \det M^{u}(T)$, we need to show that $\det M^{v}(T) = \det M^{u}(T)$. This is an easy consequence of Corollary 6.4.2. The matrices are related by a conjugation by an invertible matrix taking one of these bases to the other. (Exercise: Work out the details.)

Corollary 6.4.5 $T\colon V \to V$ has a nonzero kernel if and only if $\det T = 0$.

Proof This is an easy consequence of Theorem 6.4.4.
□
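Lemma 6.4.6 If $A^t$ denotes the transpose of the matrix $A$, then $\det A = \det A^t$.

Proof Let $A = (a_{ij})$ and let $\sigma$ be a permutation. Then
\[ a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} = a_{\sigma^{-1}(1)1}\, a_{\sigma^{-1}(2)2} \cdots a_{\sigma^{-1}(n)n}. \]
Note that $\sigma^{-1}$ runs through $S_n$ as $\sigma$ varies and that $\operatorname{sign}(\sigma) = \operatorname{sign}(\sigma^{-1})$. Hence
\begin{align*}
\det A &= \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} \\
&= \sum_{\sigma^{-1} \in S_n} \operatorname{sign}(\sigma^{-1})\, a_{\sigma^{-1}(1)1}\, a_{\sigma^{-1}(2)2} \cdots a_{\sigma^{-1}(n)n} \\
&= \sum_{\tau \in S_n} \operatorname{sign}(\tau)\, a_{\tau(1)1} \cdots a_{\tau(n)n} \\
&= \det A^t.
\end{align*}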




6.4.1 Laplace Expansion 


Laplace expansion shows how to reduce the evaluation of the determinant of an $n \times n$ matrix to that of $(n-1) \times (n-1)$ matrices.

Let $A = (a_{ij})_{1 \le i, j \le n}$ be given. Fix $i$. Let us write the $i$th row $R_i$ as $R_i = a_{i1} e_1 + \cdots + a_{in} e_n$. We expand
\begin{align*}
\det A &= \det\Bigl(R_1, \dots, R_{i-1}, \sum_{j=1}^n a_{ij} e_j, R_{i+1}, \dots, R_n\Bigr) \\
&= \sum_{j=1}^n a_{ij} \det(R_1, \dots, R_{i-1}, e_j, R_{i+1}, \dots, R_n) = \sum_{j=1}^n a_{ij} A^0_{ij}, \text{ say.}
\end{align*}

Now, we can subtract a multiple of $e_j$ from any of the $R_k$, $k \neq i$, without changing $\det(R_1, \dots, R_{i-1}, e_j, R_{i+1}, \dots, R_n)$. Thus $A^0_{ij}$ is the determinant of the matrix obtained from $A$ by changing the entry $a_{ij}$ to 1, and all other entries in the $i$th row and the $j$th column to 0. Laplace's expansion says that $A^0_{ij}$ is (up to sign) the determinant of the $(n-1) \times (n-1)$ matrix obtained from $A$ by deleting the $i$th row and $j$th column.


Definition 6.4.2 Let $A_{ij}$ denote the $(n-1)$ square matrix obtained by deleting the $i$th row and the $j$th column. $\det A_{ij}$ is called the minor of $a_{ij}$ of $A$ and the cofactor $C_{ij}$ is by definition $C_{ij} := (-1)^{i+j} \det A_{ij}$.

Theorem 6.4.7 (Laplace Expansion) For an $n \times n$ matrix $A = (a_{ij})$,
\[ \det A = a_{i1} C_{i1} + a_{i2} C_{i2} + \cdots + a_{in} C_{in}, \tag{6.4.3} \]
where $C_{ij}$ is the cofactor of $a_{ij}$.

Proof Recall the explicit expression for $\det A$:
\[ \det A = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}. \]
Each term in the right side above contains exactly one element from the $i$th row $(a_{i1}, \dots, a_{in})$ of $A$. Hence we can write $\det A = a_{i1} A^0_{i1} + \cdots + a_{in} A^0_{in}$. Then $A^0_{ij}$ is a sum of terms having no entry from the $i$th row. We need only show $A^0_{ij} = (-1)^{i+j} \det A_{ij}$.

Let us first look at a special case where $i = n$ and $j = n$. Then
\[ a_{nn} A^0_{nn} = a_{nn} \sum_{\sigma} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n-1\,\sigma(n-1)} \]
where the sum is over all permutations $\sigma \in S_n$ which leave $n$ fixed: $\sigma(n) = n$. But this is the same as summing over all permutations from $S_{n-1}$ so that
\[ A^0_{nn} = \det A_{nn} = (-1)^{n+n} \det A_{nn}. \]




In the general case, we bring the $i$th row to the $n$th row by $n - i$ interchanges of successive rows. Similarly, we bring the $j$th column to the $n$th column by $n - j$ interchanges of successive columns. In this process, observe that the relative order of the other rows and columns does not change, but the sign of $\det A$ (and hence that of $A^0_{ij}$) is changed $(n - i) + (n - j)$ times. Since $(-1)^{(n-i)+(n-j)} = (-1)^{i+j}$, the special case gives $A^0_{ij} = (-1)^{i+j} \det A_{ij}$.
□

The expansion in Equation (6.4.3) is known as the expansion by the $i$th row. Similarly, we can expand $\det A$ by its $j$th column.
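The row expansion (6.4.3) translates directly into a recursive procedure. The sketch below uses 0-based indices, which leaves the parity of $i + j$ and hence the sign $(-1)^{i+j}$ unchanged; the function names `minor` and `laplace_det` are illustrative assumptions.

```python
def minor(a, i, j):
    """The matrix A_ij: delete row i and column j (0-based) from the list of rows a."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def laplace_det(a, i=0):
    """Determinant by expansion along row i, as in Equation (6.4.3).
    The smaller determinants are expanded along their first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum(
        (-1) ** (i + j) * a[i][j] * laplace_det(minor(a, i, j))
        for j in range(n)
    )
```

By the theorem, the result should not depend on which row `i` is chosen for the first expansion.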




Exercise 6.4.1 As an immediate exercise, prove the Laplace expansion by the $j$th column:
\[ \det A = \sum_{i=1}^n (-1)^{i+j} a_{ij} \det A_{ij}. \]

Theorem 6.4.7 has many dividends.

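Proposition 6.4.8 If $A = (a_{ij})$ is an $n \times n$ matrix, then
\begin{align*}
a_{i1} C_{k1} + a_{i2} C_{k2} + \cdots + a_{in} C_{kn} &= 0 && \text{if } i \neq k \tag{6.4.4} \\
&= \det A && \text{if } i = k \tag{6.4.5} \\
a_{1j} C_{1k} + a_{2j} C_{2k} + \cdots + a_{nj} C_{nk} &= 0 && \text{if } j \neq k \tag{6.4.6} \\
&= \det A && \text{if } j = k. \tag{6.4.7}
\end{align*}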


Proof Consider the matrix $B$ obtained from $A$ by replacing the $k$th row of $A$ by its $i$th row. If $i = k$, then $B = A$ and the result follows from Equation (6.4.3). If $i \neq k$, then $\det B = 0$ (by Theorem 6.4.3 and Lemma 6.4.6), since two rows of $B$ are equal. The $k$th row of $B$ is $(a_{i1}, \dots, a_{in})$ and its cofactors are $C_{k1}, \dots, C_{kn}$. We use (6.4.3) to expand $\det B$ by its $k$th row to get
\[ 0 = \det B = a_{i1} C_{k1} + a_{i2} C_{k2} + \cdots + a_{in} C_{kn}. \]
Equations (6.4.6) and (6.4.7) are proved similarly, using columns in place of rows.
□

Let us write Equations (6.4.4) and (6.4.5) in a compact form:
\[ \sum_{j=1}^n a_{ij} C_{kj} = \delta_{ik} \det A. \tag{6.4.8} \]



This leads us to define a new matrix $\tilde{A}$ whose $(j, k)$th entry is $(-1)^{k+j} \det A_{kj}$. Thus Equation (6.4.8) says that the product $A\tilde{A}$ is $(\det A) I$. Using the column versions, Equations (6.4.6) and (6.4.7), we get
\[ \tilde{A} A = (\det A) I. \]
This new matrix $\tilde{A}$ is called the adjunct of $A$. It is denoted by $\operatorname{adj}(A)$. (We avoided naming it adjoint, as there is another more widely used concept also called adjoint.)

An important consequence of this is that we get a formula for the inverse of $A$ when $\det A \neq 0$:
\[ A^{-1} = \frac{1}{\det A} \tilde{A}. \]

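Example 6.4.1 Let
\[ A = \begin{pmatrix} 0 & -1 & 3 \\ 2 & 5 & -4 \\ -3 & 7 & 1 \end{pmatrix}. \]
The minors are
\begin{align*}
\det A_{11} &= 33, & \det A_{12} &= -10, & \det A_{13} &= 29, \\
\det A_{21} &= -22, & \det A_{22} &= 9, & \det A_{23} &= -3, \\
\det A_{31} &= -11, & \det A_{32} &= -6, & \det A_{33} &= 2.
\end{align*}
Hence
\[ \bigl((-1)^{i+j} \det A_{ij}\bigr) = \begin{pmatrix} 33 & 10 & 29 \\ 22 & 9 & 3 \\ -11 & 6 & 2 \end{pmatrix}, \]
and its transpose is
\[ \tilde{A} = \begin{pmatrix} 33 & 22 & -11 \\ 10 & 9 & 6 \\ 29 & 3 & 2 \end{pmatrix}. \]

Let us compute the determinant of $A$. We use the standard notation such as $C_i + aC_j$, which means that $a$ times the $j$th column $C_j$ is added to the $i$th column:
\[ A = \begin{pmatrix} 0 & -1 & 3 \\ 2 & 5 & -4 \\ -3 & 7 & 1 \end{pmatrix} \xrightarrow{\ C_3 + 3C_2\ } \begin{pmatrix} 0 & -1 & 0 \\ 2 & 5 & 11 \\ -3 & 7 & 22 \end{pmatrix}. \]
We now expand the matrix on the right side by the first row. We get
\[ \det A = (-1)^{1+2} (-1) (44 + 33) = 77. \]
Hence we get
\[ A^{-1} = \frac{1}{77} \begin{pmatrix} 33 & 22 & -11 \\ 10 & 9 & 6 \\ 29 & 3 & 2 \end{pmatrix}. \]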
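The adjunct and the resulting inverse formula can be reproduced mechanically. The sketch below builds $\tilde{A}$ entry by entry from the minors and returns the inverse with exact rational entries. The function names are illustrative assumptions and indices are 0-based.

```python
from fractions import Fraction

def minor(a, i, j):
    """Delete row i and column j (0-based) from the list of rows a."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjunct(a):
    """The matrix whose (j, k) entry is (-1)^(k+j) * det(A_kj)."""
    n = len(a)
    return [[(-1) ** (k + j) * det(minor(a, k, j)) for k in range(n)]
            for j in range(n)]

def inverse(a):
    """Inverse as (1 / det A) * adjunct(A); raises an error if det A = 0."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(entry, d) for entry in row] for row in adjunct(a)]

def matmul(a, b):
    """Product of two n x n matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Applied to the matrix of the example, `adjunct` should return the transposed cofactor matrix found by hand, and multiplying the matrix by the result of `inverse` should give the identity.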


6.4.2 Cramer’s Rule 


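Consider a system of three equations in three unknowns, $Ax = b$ where
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \quad \text{and} \quad x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}. \]
We assume $\det A \neq 0$. Thus we want to solve
\[ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}. \]
Let
\[ A_i = \begin{pmatrix} a_{1i} \\ a_{2i} \\ a_{3i} \end{pmatrix} \]
be the $i$th column of $A$, for $i = 1, 2, 3$. Then
\begin{align*}
x_1 \det(A_1, A_2, A_3) &= \det(x_1 A_1, A_2, A_3) \\
&= \det(x_1 A_1 + x_2 A_2 + x_3 A_3, A_2, A_3) \\
&= \det(b, A_2, A_3).
\end{align*}
Therefore,
\[ x_1 = \frac{\det(b, A_2, A_3)}{\det A}. \]
This proof can obviously be generalized for any $n$. We have thus the following theorem: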
following theorem: 


Theorem 6.4.9 (Cramer's Rule) Let $A \in M(n, \mathbb{R})$ with $\det A \neq 0$. Let $b \in \mathbb{R}^n$ be a column vector. Then the solution of $Ax = b$ is given by
\[ x_j = \frac{\det(A_1, \dots, b, \dots, A_n)}{\det A} \]
where $b$ is in the $j$th place and $\det A = \det(A_1, \dots, A_n)$ where $A_i$ is the $i$th column of $A$.
□




Proof 2. We want to solve $Ax = b$, where $A$ is an $n \times n$ matrix and $b$ is a fixed column vector. Write $A = (A_1, \dots, A_n)$, where $A_i$ is the $i$th column. Set
\[ X_k = (e_1, \dots, e_{k-1}, x, e_{k+1}, \dots, e_n) \]
where $I = (e_1, \dots, e_n)$. Then
\begin{align*}
x_k &= \det(e_1, \dots, e_{k-1}, x_k e_k, e_{k+1}, \dots, e_n) \\
&= \det X_k \\
&= \det(A^{-1} A X_k) \\
&= \det(A X_k) / \det A && \text{(since } \det(AB) = \det A \det B \text{ and } \det(A^{-1}) = (\det A)^{-1}\text{)} \\
&= \det(A_1, \dots, A_{k-1}, b, A_{k+1}, \dots, A_n) / \det A
\end{align*}
since $A X_k = (Ae_1, \dots, Ae_{k-1}, Ax, Ae_{k+1}, \dots, Ae_n)$.
□
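Example 6.4.2 Let us solve the system
\begin{align*}
x + y &= 0 \\
y + z &= 1 \\
z + x &= -1.
\end{align*}
This can easily be solved. From the first equation, we see that $x = -y$. Substituting this value of $y$ in the second equation, we get $z - x = 1$. This along with $z + x = -1$ gives us $z = 0$, $x = -1$ and $y = 1$. However, we shall solve this using Cramer's rule. The coefficient matrix $A$ is
\[ A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}. \]
Let us write the system as the matrix equation $A (x, y, z)^t = b$ where
\[ b = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}. \]
We compute the determinant of $A$:
\[ \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \xrightarrow{\ C_2 - C_1\ } \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & -1 & 1 \end{pmatrix}, \qquad \det(A) = (-1)^{1+1}\, 2 = 2 \neq 0. \]

Now we can apply Cramer's rule. Let $B_i$ denote the matrix obtained from $A$ by replacing the $i$th column by the column vector $b$. Then we have
\[ B_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ -1 & 0 & 1 \end{pmatrix}. \]
Hence $\det B_1 = (-1)^{1+2}\, 2 = -2$. Similarly,
\[ B_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & -1 & 1 \end{pmatrix}, \qquad B_3 = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & -1 \end{pmatrix} \xrightarrow{\ C_2 - C_1\ } \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & -1 & -1 \end{pmatrix}. \]
Hence $\det B_2 = (-1)^{1+1}\, 2 = 2$ and $\det B_3 = (-1)^{1+1}\, 0 = 0$. Thus the solution is given by
\begin{align*}
x &= \frac{\det B_1}{\det A} = -1, \\
y &= \frac{\det B_2}{\det A} = 1, \\
z &= \frac{\det B_3}{\det A} = 0.
\end{align*}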
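Cramer's rule is straightforward to express in code, although for larger systems elimination is far cheaper. The following sketch builds each matrix $B_j$ by replacing the $j$th column of $A$ with $b$ and returns exact rational answers; the names `det` and `cramer` are illustrative assumptions.

```python
from fractions import Fraction

def det(a):
    """Determinant by Laplace expansion along the first row (a is a list of rows)."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for j in range(len(a)):
        sub = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(sub)
    return total

def cramer(a, b):
    """Solve a x = b by Cramer's rule; requires det(a) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule needs a nonzero determinant")
    solution = []
    for j in range(len(a)):
        # B_j: the matrix a with its j-th column replaced by b
        b_j = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(b_j), d))
    return solution
```

For the system of Example 6.4.2, the call `cramer([[1, 1, 0], [0, 1, 1], [1, 0, 1]], [0, 1, -1])` should return the values of $x$, $y$, $z$ found above.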
Exercise 6.4.2 Let | oe 
A=|-3 41]. 
4 -4 5 


(1) Find the adjunct $\tilde{A}$ of $A$.
(2) Compute $\det A$.
(3) Show that $A\tilde{A} = (\det A) I$.


Exercise 6.4.3 Compute the inverse of the following matrices if they exist: 


422 40 2 
(i) f 1 ) (ii) f 3 ) 
103 01 -2 


Exercise 6.4.4 Compute the determinant of


12s 4 
210 -l 
204 2 
731 -1 




Exercise 6.4.5 Compute the determinant of
\[ \begin{pmatrix} 1 - x & 1 & 1 \\ 1 & 1 - x & 1 \\ 1 & 1 & 1 - x \end{pmatrix}. \]
Can you generalize this?


Exercise 6.4.6 Find the inverse of the following matrices: 


Ll 2 2 224] 
(@) {3 1 0). Gi) [1 21 
| a re | 11 
Exercise 6.4.7 Solve the system of equations: 
2r+y = ‘ 
3y +z =" i 
4z+2 = 2. 


Exercise 6.4.8 True or false: det(A + B) = det(A) + det(B)? 
Exercise 6.4.9 For what values of $r$ do we have $\det(aA) = a^r \det A$?


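Exercise 6.4.10 (Block-diagonal matrix) Consider a matrix of the form
\[ C = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \]
where $A$ and $B$ are square matrices and each 0 denotes a matrix of zeroes. $C$ is called a block-diagonal matrix with two diagonal blocks $A$ and $B$. Now $\det C = \det A \det B$ as
\[ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} = \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & B \end{pmatrix} \tag{6.4.9} \]
where $I$ is the identity matrix. Consider the function
\[ f(A) = \det \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix}. \]
$f$ satisfies the conditions (1) and (2) of Theorem 6.2.4 and $f(I) = 1$. Therefore,
\[ \det \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix} = \det A. \]
Similarly, we get
\[ \det \begin{pmatrix} I & 0 \\ 0 & B \end{pmatrix} = \det B. \]
Since $\det(ST) = \det(S) \det(T)$ the result follows from Equation (6.4.9).

Exercise 6.4.11 If
\[ C = \begin{pmatrix} A & D \\ 0 & B \end{pmatrix} \]
then $\det C = \det A \det B$.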


6.4.3 Some Geometric Ideas 


Let $A = (a_{ij}) \in M(n, \mathbb{R})$. $\det A$ can be thought of in two ways:

(1) As in the definition, $\det(A) = \sum_{\sigma \in S_n} \operatorname{sign}(\sigma)\, a_{1\sigma(1)} \cdots a_{n\sigma(n)}$.

(2) Since $A_j = Ae_j$, $\det(A_1, A_2, \dots, A_n)$ stands for the 'signed' volume of the image of the unit cube under the linear map $A$. In this case we can think of $\det A$ as the distortion factor under $A$ of the volume in the domain space.

This geometric way of looking at the determinant is quite useful in differential geometry. Let us look at some examples.


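Lemma 6.4.10 Let $\{v_i\}_{i=1}^n$ be a basis for $\mathbb{R}^n$. Let $v_i = \sum_j v_{ji} e_j$, where $\{e_j\}_{j=1}^n$ is the standard basis for $\mathbb{R}^n$. Then
\[ \operatorname{vol}([v_1, \dots, v_n])^2 := \det(v_1, \dots, v_n)^2 = \det(\langle v_i, v_j \rangle). \]

Proof Let $A$ be the linear operator which takes $e_i$ to $v_i$. Then its matrix with respect to $\{e_i\}$ is given by $A = [v_1, \dots, v_n]$, where
\[ v_i = \begin{pmatrix} v_{1i} \\ \vdots \\ v_{ni} \end{pmatrix}. \]
We have
\[ \det(A^t A) = \det\bigl([v_1, \dots, v_n]^t [v_1, \dots, v_n]\bigr) = \det \begin{pmatrix} \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & & \vdots \\ \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{pmatrix}. \]
That is, $(\det A)^2 = \det(\langle v_i, v_j \rangle)$, $1 \le i, j \le n$. Therefore
\[ \operatorname{vol}([v_1, \dots, v_n]) = \lvert \det(\langle v_i, v_j \rangle) \rvert^{1/2}. \]
□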


Remark 6.4.1 Let $V$ be an inner product space and $\{e_1, \dots, e_n\}$ be an orthonormal basis of $V$. If the basis $\{u_1, \dots, u_n\}$ is orthogonal, then the parallelepiped $[u_1, \dots, u_n]$ is 'rectangular'. From Lemma 6.4.10, we see that
\[ \det(u_1, \dots, u_n)^2 = \prod_{i=1}^n \lVert u_i \rVert^2. \]
This agrees with our intuition, namely the volume of a rectangular parallelepiped is the product of the lengths of its sides.

If $\{v_1, \dots, v_n\}$ is a basis of $V$, then by the Gram-Schmidt process we get an orthogonal basis $\{u_1, \dots, u_n\}$. Let $A = (v_1, \dots, v_n)$ where $v_i$ is thought of as a column
\[ v_i = \begin{pmatrix} v_{1i} \\ \vdots \\ v_{ni} \end{pmatrix}. \]
Since $u_i$ is $v_i$ plus a linear combination of the $v_j$ for $1 \le j < i$, it follows that if we replace $v_i$ by $u_i$ then $\det(v_1, \dots, v_n) = \det(u_1, \dots, u_n)$. This corresponds to the geometric idea of cutting the parallelepiped into pieces and rearranging them to get a rectangular parallelepiped as was done in the case of a parallelogram.


Remark 6.4.2 If we think of elements of $\mathbb{R}^n$ as column vectors then the dot product on $\mathbb{R}^n$ can be written as $\langle x, y \rangle = y^t x$, the matrix multiplication of a $1 \times n$ matrix by an $n \times 1$ matrix. A $1 \times 1$ matrix $(a)$ is thought of as the real number $a$.


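Lemma 6.4.11 Let $A \in M(n, \mathbb{R})$ be given. Let $\{v_i\}_{i=1}^n$ be a basis of $\mathbb{R}^n$. Assume that we are given $\langle Av_i, v_j \rangle$ for all $1 \le i, j \le n$. Then
\[ \det(A) = \frac{\det(\langle Av_i, v_j \rangle)}{\det(\langle v_i, v_j \rangle)}. \]

Proof Let $B$ be the matrix which takes $e_i$ to $v_i$. (Recall that the $i$th column of $B$ is the column vector $v_i$.) Now, we have
\[ \langle Av_i, v_j \rangle = v_j^t A v_i = (Be_j)^t A B e_i = e_j^t B^t A B e_i = \langle B^t A B e_i, e_j \rangle. \]
Hence
\begin{align*}
\det(\langle Av_i, v_j \rangle) &= \det(\langle B^t A B e_i, e_j \rangle) \\
&= \det(B^t A B) \\
&= \det(B^t) \det(A) \det(B). \tag{6.4.10}
\end{align*}
Again,
\begin{align*}
\det(B^t B) &= \det(\langle B^t B e_i, e_j \rangle) \\
&= \det(\langle Be_i, Be_j \rangle) \\
&= \det(\langle v_i, v_j \rangle). \tag{6.4.11}
\end{align*}
The result follows from Equations (6.4.10) and (6.4.11).
□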
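Both Lemma 6.4.10 and Lemma 6.4.11 can be tested on concrete data. In the sketch below the basis vectors are tuples of coordinates with respect to the standard basis and the inner product is the usual dot product. The function names and the sample matrix and basis in the closing note are illustrative assumptions.

```python
from fractions import Fraction

def det(a):
    """Determinant by Laplace expansion along the first row (a is a list of rows)."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for j in range(len(a)):
        sub = [list(row[:j]) + list(row[j + 1:]) for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(sub)
    return total

def dot(x, y):
    """Standard inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def apply(a, x):
    """Matrix-vector product a x."""
    return tuple(dot(row, x) for row in a)

def gram(vs):
    """The matrix of inner products <v_i, v_j>."""
    return [[dot(vi, vj) for vj in vs] for vi in vs]

def det_from_pairings(a, vs):
    """det(A) recovered as det(<A v_i, v_j>) / det(<v_i, v_j>), as in Lemma 6.4.11."""
    pairings = [[dot(apply(a, vi), vj) for vj in vs] for vi in vs]
    return Fraction(det(pairings), det(gram(vs)))
```

With the basis `[(1, 1, 0), (0, 1, 1), (1, 0, 1)]`, whose determinant is 2, the Gram determinant should come out as 4, in line with Lemma 6.4.10.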




6.5 Orientation and Vector Product 


This section deals with two more uses of determinants, which may be skipped in the first reading.


6.5.1 Orientation 


Let $V$ be any vector space, $v_i \in V$, $1 \le i \le n$. We have emphasized that we may think of $\lvert \det(v_1, \dots, v_n) \rvert$ as the volume of the parallelepiped $[v_1, \dots, v_n]$. There is another use of determinants which employs the sign of the determinant. In $\mathbb{R}^2$ and $\mathbb{R}^3$ we have notions of orientation. To talk of orientation in higher dimensions is quite unintuitive unless based on some mathematical concept.

We fix a basis $\{e_1, \dots, e_n\}$ of $V$. Let $\{v_1, \dots, v_n\}$ be another basis. Let $A$ be the matrix which takes $e_i$ to $v_i$. Note that $\det(A) \neq 0$, thanks to Theorem 6.4.3. We say that $\{v_1, \dots, v_n\}$ is positively (or negatively) oriented if $\det(A) > 0$ (respectively if $\det(A) < 0$). Note that the order in the listing of the basis is important; for instance, the basis $\{e_2, e_1, e_3, \dots, e_n\}$ is negatively oriented.

One can see that this agrees with our intuition in the case of $\mathbb{R}^2$ and $\mathbb{R}^3$ where the fixed basis is taken as the standard basis.

One usually thinks of $\det(v_1, \dots, v_n)$ as the oriented volume of the parallelepiped $[v_1, \dots, v_n]$.


Remark 6.5.1 More abstractly, given two ordered bases $B_1$ and $B_2$ of a real vector space $V$, the unique linear isomorphism of $V$ which takes $B_1$ to $B_2$ (preserving the order) has nonzero determinant. We say that $B_1$ and $B_2$ are equivalent if this determinant is positive. Thus, the set $\mathcal{B}$ of ordered bases of a real vector space $V$ is the disjoint union of two subsets $\mathcal{B}_1$ and $\mathcal{B}_2$, where the nonsingular transformation which takes one basis of $\mathcal{B}_i$ to another in $\mathcal{B}_i$ has positive determinant whereas the linear isomorphism which takes one basis, say, from $\mathcal{B}_1$ to another in $\mathcal{B}_2$ has negative determinant. An orientation of $V$ is nothing other than declaring one of the $\mathcal{B}_i$ to be the set of positive bases of $V$ and calling the other the set of negative bases.


6.5.2 Vector Product 


We define a cross product on a three-dimensional real vector space $V$ with an inner product $(x, y) \mapsto \langle x, y \rangle$. We fix an orthonormal basis $\{e_i\}$ of $V$, so that $\langle e_i, e_j \rangle = \delta_{ij}$. If you wish you may take $V = \mathbb{R}^3$ with the standard basis vectors and the Euclidean inner product $(x, y) \mapsto \langle x, y \rangle := \sum_i x_i y_i$. We also have the Riesz representation theorem: For any linear map $f\colon V \to \mathbb{R}$ there exists a unique $u \in V$ such that $f(x) = \langle x, u \rangle$. Hint: With basis vectors $e_i$ we take $u := \sum_i f(e_i) e_i$. We now define the cross product or vector product on $V$ as follows:

for $x, y \in V$, the map $z \mapsto \det(x, y, z)$ is a linear map of $V$ to $\mathbb{R}$ and hence by the Riesz representation theorem (Theorem 5.7.1) there exists a unique vector $v$ such that $\langle v, z \rangle = \det(x, y, z)$, for all $z \in V$. We denote this vector $v$ by $x \times y$ and call it the cross product or the vector product of $x$ and $y$. Let us record this defining property of $x \times y$:
\[ \langle z, x \times y \rangle := \det(x, y, z), \quad \text{for all } z \in V. \]


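Let us find the coordinates of $x \times y$ with respect to the orthonormal basis
$\{e_i\}$. If we write $x \times y = \sum_i w_i e_i$, then

\begin{align*}
w_1 = \langle x \times y, e_1\rangle &= \langle e_1, x \times y\rangle \\
&= \det(x, y, e_1) \\
&= \det(x_1e_1 + x_2e_2 + x_3e_3,\; y_1e_1 + y_2e_2 + y_3e_3,\; e_1) \\
&= \det(x_2e_2 + x_3e_3,\; y_2e_2 + y_3e_3,\; e_1) \\
&= \det(x_2e_2, y_3e_3, e_1) + \det(x_3e_3, y_2e_2, e_1) \\
&= x_2y_3 \det(e_2, e_3, e_1) + x_3y_2 \det(e_3, e_2, e_1) \\
&= x_2y_3 - x_3y_2.
\end{align*}

One similarly finds that $w_2 = x_3y_1 - x_1y_3$ and $w_3 = x_1y_2 - x_2y_1$.
Thus
\[ x \times y = (x_2y_3 - x_3y_2)e_1 + (x_3y_1 - x_1y_3)e_2 + (x_1y_2 - x_2y_1)e_3
= \begin{pmatrix} x_2y_3 - x_3y_2 \\ x_3y_1 - x_1y_3 \\ x_1y_2 - x_2y_1 \end{pmatrix}. \]

This is the familiar expression for $x \times y$ in vector analysis.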
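As a quick numerical check of the coordinate formula, the following Python sketch compares $\langle z, x \times y\rangle$ with $\det(x, y, z)$ for $z$ running over the standard basis of $\mathbb{R}^3$. The helper names `det3`, `cross` and `dot` are illustrative choices, not notation from the text.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix whose columns are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))


def cross(x, y):
    """Coordinates of x × y with respect to an orthonormal basis."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def dot(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))


x, y = (1, 2, 3), (4, 5, 6)
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
# Defining property: <z, x × y> = det(x, y, z) for every basis vector z.
checks = [dot(z, cross(x, y)) == det3(x, y, z) for z in basis]
print(cross(x, y), checks)
```

Since both sides are linear in $z$, agreement on a basis is enough to conclude agreement for all $z$.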
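Lemma 6.5.1 The vector product has the following properties:
(1) $x \times y$ is orthogonal to $x$ and $y$.
(2) $\lambda x \times y = \lambda(x \times y) = x \times \lambda y$, for $\lambda \in \mathbb{R}$.
(3) $y \times x = -x \times y$.
(4) $x \times y = 0$ if and only if $x$ and $y$ are linearly dependent.
(5) $\langle x \times y, z\rangle = \langle y \times z, x\rangle = \langle z \times x, y\rangle$.
(6) $\langle x, y \times z\rangle = \langle y, z \times x\rangle = \langle z, x \times y\rangle$.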


Proof These are immediate consequences of the properties of determinants.
We give a sample of the arguments. Let us prove the first assertion. For
instance, to prove that $\langle x, x \times y\rangle = 0$, we have by the very definition of the
vector product
\[ \langle x, x \times y\rangle = \det(x, y, x) = 0, \]
since two terms are equal in the determinant. The rest of the assertions are proved
on similar lines and we leave them to the reader. □


Proposition 6.5.2 For any three vectors $x, y, z \in V$, we have
\[ x \times (y \times z) = \langle x, z\rangle\, y - \langle x, y\rangle\, z. \tag{6.5.1} \]


Proof To show that these two vectors are equal, it is enough to show
that their inner products with any vector of $V$ (in fact, any vector in an
orthonormal basis) are the same:
\[ \langle v, x \times (y \times z)\rangle = \langle v, \langle x, z\rangle\, y - \langle x, y\rangle\, z\rangle. \]
In view of (6), the left side equals $\langle v \times x, y \times z\rangle$, so it is enough to verify for an arbitrary vector $v$,
\[ \langle v \times x, y \times z\rangle = \langle v, y\rangle \langle x, z\rangle - \langle x, y\rangle \langle v, z\rangle. \tag{6.5.2} \]

We first observe that both sides are linear in each of the variables. Hence it
is enough to verify it on $\{e_i\}$. Due to symmetry we may take $y = e_1$,
$z = e_2$ so that $y \times z = e_3$. Now it is easily checked that both sides
of Equation (6.5.2) are equal to $v_1x_2 - v_2x_1$. □
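The two identities can be spot-checked with exact integer arithmetic. The sketch below is only a sanity check on sample vectors (chosen arbitrarily here), not a proof; `cross`, `dot`, `scale` and `sub` are hypothetical helper names.

```python
def cross(x, y):
    """Coordinates of x × y with respect to an orthonormal basis."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def scale(c, v):
    return tuple(c * a for a in v)


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


x, y, z, v = (1, 2, 3), (4, 5, 6), (7, 8, 10), (2, -1, 5)

# Equation (6.5.1): x × (y × z) = <x, z> y - <x, y> z
lhs_triple = cross(x, cross(y, z))
rhs_triple = sub(scale(dot(x, z), y), scale(dot(x, y), z))

# Equation (6.5.2): <v × x, y × z> = <v, y><x, z> - <x, y><v, z>
lhs_inner = dot(cross(v, x), cross(y, z))
rhs_inner = dot(v, y) * dot(x, z) - dot(x, y) * dot(v, z)

print(lhs_triple == rhs_triple, lhs_inner == rhs_inner)
```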


Note that the vector $x \times y$ is orthogonal to $x$ and $y$. If $x$ and
$y$ are linearly independent, then $x \times y \ne 0$. In fact, from Equation (6.5.2),
it follows that $\|x \times y\|^2 = \langle x, x\rangle \langle y, y\rangle - \langle x, y\rangle^2$, which is the square of the
area of the parallelogram spanned by $x$ and $y$. Thus, $\{x, y, x \times y\}$ is a
basis of $V$. Also, we claim that this has the same orientation as the basis
$\{e_1, e_2, e_3\}$. To show this we need to show that if $A$ is the matrix such that
$Ae_1 = x$, $Ae_2 = y$ and $Ae_3 = x \times y$, then $\det(A) > 0$. The matrix $A$ has as
its columns $x$, $y$ and $x \times y$: $A = (x, y, x \times y)$, where $x$ etc. are thought of
as column vectors with respect to the basis $\{e_i\}$. Now,

\[ \det(A) = \det(x, y, x \times y) = \langle x \times y, x \times y\rangle > 0, \]

since $x \times y \ne 0$.
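A small Python sketch, under the assumption $V = \mathbb{R}^3$ with the standard basis, illustrates both facts at once: the squared length of $x \times y$ equals $\langle x, x\rangle\langle y, y\rangle - \langle x, y\rangle^2$, and $\det(x, y, x \times y)$ is that same positive number. The function names are illustrative only.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix whose columns are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))


def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


x, y = (1, 2, 3), (4, 5, 6)          # linearly independent sample vectors
n = cross(x, y)

norm_sq = dot(n, n)                                   # ||x × y||^2
area_sq = dot(x, x) * dot(y, y) - dot(x, y) ** 2      # squared parallelogram area
orientation = det3(x, y, n)                           # det(x, y, x × y)

print(norm_sq, area_sq, orientation)
```

A positive value of `orientation` is what it means for $\{x, y, x \times y\}$ to have the same orientation as the standard basis.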
The geometric meaning of the vector or cross product $x \times y$ is that it
is the vector orthogonal to $x$ and $y$ with the property that $\{x, y, x \times y\}$
is a basis with the same orientation as $\{e_1, e_2, e_3\}$ and is of length equal
to the area of the parallelogram spanned by $x$ and $y$. It may be noted
that the length is $\|x\| \|y\| \sin\theta$, where $\theta$ is the angle between $x$ and $y$ (see
Section 6.1).




7. Diagonalization 


The simplest linear maps from a vector space $V$ to itself are $aI$, for $a \in \mathbb{R}$.
Next come the linear maps of the form $v_i \mapsto a_i v_i$, where $\{v_i\}$ is a basis of $V$
and $a_i \in \mathbb{R}$, $1 \le i \le n$. If we write the matrix of such a map with respect
to this basis, it is of the form $(a_{ij})$, where

\[ a_{ij} = \begin{cases} 0 & \text{if } i \ne j \\ a_i & \text{if } i = j, \end{cases} \]

that is, it is a diagonal matrix. We denote this diagonal matrix by $\operatorname{diag}(a_1, \ldots, a_n)$.
The main theme of this chapter is to prove that if we are given a symmetric
$n \times n$ matrix $A$, then we can find an orthonormal basis $\{v_i\}$ of $\mathbb{R}^n$ and scalars
$a_i \in \mathbb{R}$ such that $Av_i = a_iv_i$ for $1 \le i \le n$.


7.1 Rotation of Axes of Conics 


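Let us start by reviewing the trick of rotation of axes so that a conic given by
the quadratic expression $ax^2 + 2hxy + by^2$ is written in one of the standard
forms of an ellipse, hyperbola or parabola.

Let
\[ A = \begin{pmatrix} a & h \\ h & b \end{pmatrix}. \]

Then the given quadratic expression can be expressed as

\[ ax^2 + 2hxy + by^2 = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & h \\ h & b \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

$A$ is called the coefficient matrix of the homogeneous quadratic polynomial.
The rotation by the angle $\theta$ in the anticlockwise direction is given by

\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]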
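If we effect the coordinate transformation

\[ \begin{pmatrix} x \\ y \end{pmatrix} = R_\theta \begin{pmatrix} u \\ v \end{pmatrix}, \]

then the quadratic expression becomes

\[ \begin{pmatrix} u \\ v \end{pmatrix} \mapsto (u, v)\, R_\theta^t A R_\theta \begin{pmatrix} u \\ v \end{pmatrix}. \]

Hence the coefficient matrix with respect to the new coordinates $(u, v)$ is

\[ R_\theta^t A R_\theta = R_{-\theta} A R_\theta = \begin{pmatrix} p & q \\ q & r \end{pmatrix}, \]

where

\[ q = h(\cos^2\theta - \sin^2\theta) - (a - b)\sin\theta\cos\theta \]
and hence

\[ 2q = 2h\cos 2\theta - (a - b)\sin 2\theta. \]
We can make $q = 0$ by taking $\theta$ so that $\tan 2\theta = \dfrac{2h}{a - b}$ if $a \ne b$, or
\[ \theta = \begin{cases} \pi/4 & \text{if } a = b \text{ and } h \ne 0 \\ 0 & \text{if } h = 0. \end{cases} \]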
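The recipe for $\theta$ can be tried out numerically. The following Python sketch computes the angle by the case distinction above and then evaluates the entries $p$, $q$, $r$ of $R_\theta^t A R_\theta$ directly; the function names and the sample coefficients are assumptions made for illustration.

```python
import math


def rotation_angle(a, h, b):
    """Angle theta that makes the off-diagonal entry q vanish."""
    if h == 0:
        return 0.0
    if a == b:
        return math.pi / 4
    return 0.5 * math.atan(2 * h / (a - b))


def rotated_coefficients(a, h, b, theta):
    """Entries p, q, r of R_theta^t A R_theta for A = [[a, h], [h, b]]."""
    c, s = math.cos(theta), math.sin(theta)
    p = a * c * c + 2 * h * s * c + b * s * s
    q = h * (c * c - s * s) - (a - b) * s * c
    r = a * s * s - 2 * h * s * c + b * c * c
    return p, q, r


a, h, b = 3.0, 1.0, 1.0
theta = rotation_angle(a, h, b)
p, q, r = rotated_coefficients(a, h, b, theta)
print(theta, p, q, r)
```

Because $R_\theta$ is orthogonal, the trace and determinant are unchanged, so $p + r = a + b$ and $pr = ab - h^2$ once $q = 0$.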


We thus have the quadratic expression as $(u, v) \mapsto pu^2 + rv^2$.

In geometric terms, $p$ and $r$ determine the principal (major/minor) axes of the
conic. Note that if $v$ is a point of the conic $Q$, then $v$
is orthogonal to the tangent space $T_vQ$ if and only if $Av = pv$ or $rv$. For,
the tangent line is given by
\[ (h_1, h_2)\, A v = 0, \]
that is, by $(Av)^\perp$. Hence $v \in Q$ is orthogonal to $(Av)^\perp$, or $v$ is a scalar
multiple of $Av$. This introduces the notion of eigenvector.


Definition 7.1.1 Let $V$ be a vector space, $A: V \to V$ be linear. We say
that a nonzero $v \in V$ is an eigenvector for $A$ if there exists $a \in \mathbb{R}$ such that
$Av = av$. $a$ is called an eigenvalue.




Example 7.1.1 Assume that there exists a basis $\{v_i\}_{i=1}^n$ such that the
matrix of $A$ with respect to this basis is diagonal: $M_v(A) = \operatorname{diag}(a_1, \ldots, a_n)$.
Then $v_i$ is an eigenvector of $A$ with eigenvalue $a_i$.

Example 7.1.2 Let
\[ A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]
Then $v_1 = e_1 + e_2$ and $v_2 = e_1 - e_2$ are eigenvectors with eigenvalues 1 and
$-1$ respectively.


Assume that there exists $a \in \mathbb{R}$ and a nonzero $v \in V$ such that $Av = av$.
Then $(A - aI)v = 0$. By Theorem 6.4.4, $\det(A - aI) = 0$. Conversely, if
$\det(A - aI) = 0$ for some $a \in \mathbb{R}$, then again by Theorem 6.4.4, there exists
a nonzero $v \in V$ such that $(A - aI)v = 0$ or $Av = av$. Thus finding the (real)
eigenvalues is equivalent to finding the real roots of the polynomial equation
$\det(A - xI) = 0$. The polynomial $\det(A - xI)$ is called the characteristic
polynomial of $A$.

We shall concentrate on $\mathbb{R}^2$ in the rest of the section. We shall identify
any linear map $A: \mathbb{R}^2 \to \mathbb{R}^2$ with its matrix $M(A)$ with respect to the
standard basis of $\mathbb{R}^2$. Let

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

Then the characteristic polynomial $\chi_A$ of $A$ is given by

\[ \det(A - xI) = x^2 - (a + d)x + (ad - bc) = x^2 - \operatorname{tr}(A)\,x + \det(A). \]

Now this quadratic polynomial has real roots if and only if its discriminant
"$b^2 - 4ac$" is non-negative.

We now assume that the matrix $A$ is symmetric, say,

\[ A = \begin{pmatrix} a & h \\ h & b \end{pmatrix}. \]

Then the characteristic polynomial is $x^2 - (a + b)x + (ab - h^2)$. Its
discriminant is $(a + b)^2 - 4(ab - h^2) = (a - b)^2 + 4h^2 \ge 0$. Thus, a symmetric
matrix of order 2 has real eigenvalues.
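The discriminant argument translates directly into a computation. As a minimal sketch (the function name `symmetric_eigenvalues` is an assumption for illustration), the quadratic formula applied to $x^2 - (a + b)x + (ab - h^2)$ gives both eigenvalues of a symmetric $2 \times 2$ matrix, and the square root is always defined because the discriminant is a sum of squares.

```python
import math


def symmetric_eigenvalues(a, h, b):
    """Real eigenvalues of [[a, h], [h, b]] via the characteristic polynomial."""
    # (a + b)^2 - 4(ab - h^2) simplifies to (a - b)^2 + 4h^2, which is never negative.
    disc = (a - b) ** 2 + 4 * h ** 2
    root = math.sqrt(disc)
    return ((a + b - root) / 2, (a + b + root) / 2)


# The matrix of Example 7.1.2, [[0, 1], [1, 0]]:
print(symmetric_eigenvalues(0, 1, 0))
```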

We shall redo the earlier formula for the angle of a rotation that brings
the matrix into diagonal form in a slightly different way which will
generalize to higher dimensions.

Let us consider the map $f: [0, 2\pi] \to \mathbb{R}$ given by

\[ f(t) := \left\langle A \begin{pmatrix} \cos t \\ \sin t \end{pmatrix}, \begin{pmatrix} \cos t \\ \sin t \end{pmatrix} \right\rangle, \]

the dot product of $Av$ and $v$, where $v = (\cos t, \sin t)^t$ is the column vector which is on the
unit circle in $\mathbb{R}^2$. A computation yields

\[ f(t) = a\cos^2 t + 2h\sin t\cos t + b\sin^2 t = a\cos^2 t + h\sin 2t + b\sin^2 t. \]


Since $f$ is clearly a continuous real valued function on the closed
bounded interval $[0, 2\pi]$, it attains its maximum and minimum in $[0, 2\pi]$.
Since $f(0) = f(2\pi)$, both these extrema cannot be at the end points of
the interval (unless $f$ is a constant). If $f$ is a constant, all points of
the open interval are points of extremum for $f$. Thus we may assume that
an extremum, say a maximum, occurs at $\theta$ in the open interval $(0, 2\pi)$.
Then $f'(\theta) = 0$. We find $f'(t) = (b - a)\sin 2t + 2h\cos 2t$. So $\theta$ satisfies the
equation

\[ (b - a)\sin 2\theta + 2h\cos 2\theta = 0 \quad\text{or}\quad \tan 2\theta = \frac{2h}{a - b}, \]

provided $a \ne b$. If $a = b$, then $f'(\theta) = 0$ implies that we may take $\theta = \pi/4$.
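To see the variational idea at work, the sketch below (with arbitrarily chosen sample values of $a$, $h$, $b$) picks the critical angle of $f$ and checks that the corresponding unit vector $v$ satisfies $Av = f(\theta)\,v$. It uses `math.atan2`, which returns an angle $2\theta$ with $\tan 2\theta = 2h/(a - b)$ and also covers $a = b$; this particular branch choice happens to select the maximum of $f$.

```python
import math

a, h, b = 3.0, 1.0, 1.0      # sample symmetric matrix [[a, h], [h, b]]

theta = 0.5 * math.atan2(2 * h, a - b)
c, s = math.cos(theta), math.sin(theta)

f_theta = a * c * c + 2 * h * s * c + b * s * s              # f(theta) = <Av, v>
f_prime = (b - a) * math.sin(2 * theta) + 2 * h * math.cos(2 * theta)

Av = (a * c + h * s, h * c + b * s)
residual = math.hypot(Av[0] - f_theta * c, Av[1] - f_theta * s)  # ||Av - f(theta) v||

print(theta, f_theta, f_prime, residual)
```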


Exercise 7.1.1 Investigate the case when $f$ is a constant.


7.2 Eigenvalues and Eigenvectors 


$V$ denotes a finite dimensional vector space over $\mathbb{R}$. Let $A: V \to V$ be
linear. We fix a basis $\{e_1, \ldots, e_n\}$ of $V$. We use the same symbol $A$ to
denote the matrix $M_e(A)$.


Definition 7.2.1 We say a real number $a$ is an eigenvalue of $A$ if there
exists a nonzero vector $v \in V$ such that $Av = av$. Any nonzero vector
$u \in V$ such that $Au = \lambda u$ is called an eigenvector of $A$ corresponding to
the eigenvalue $\lambda$.


Exercise 7.2.1 If $v$ is an eigenvector with eigenvalue $\lambda$ and $\alpha \in \mathbb{R}$ is nonzero,
then $\alpha v$ is an eigenvector with eigenvalue $\lambda$. In fact, the set

\[ V_\beta := \{x \in V \mid Ax = \beta x\} \]
is a vector subspace of $V$ for any $\beta \in \mathbb{R}$. (You can prove this directly. But
can you think of a proof which uses some earlier result?)


The central problem is to find whether there is a basis $\{v_i\}$ of $V$
consisting of eigenvectors of $A$: $Av_i = a_iv_i$. We call such a basis an $A$-eigen
basis of $V$. If there is no confusion, we shall simply say an eigen basis
of $V$.
If $\{v_i\}$ is an eigen basis of $V$, then $M_v(A)$ is diagonal:

\[ M_v(A) = \operatorname{diag}(a_1, \ldots, a_n). \]




The characteristic equation of $A$ is $\det(A - \lambda I) = 0$.




A unit vector perpendicular to this in the $xz$-plane is given by

\[ \begin{pmatrix} 1/\sqrt{5} \\ 0 \\ -2/\sqrt{5} \end{pmatrix}. \]

This is an eigenvector with eigenvalue 4, as can be easily verified.

Definition 7.2.2 A linear map $A: V \to V$ is said to be diagonalizable if
there exists an $A$-eigen basis of $V$. Equivalently, if there exists a basis $\{v_i\}$
of $V$ with respect to which $M_v(A)$ is a diagonal matrix.


Example 7.2.4 Reflections in $\mathbb{R}^2$ are diagonalizable. Let $R$ denote the
reflection with respect to the $x$-axis. Then $R(x, y) = (x, -y)$ so that $Re_1 = e_1$
and $Re_2 = -e_2$. Thus the matrix of $R$ with respect to the standard basis is

\[ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \]

which is already in the diagonal form.


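More generally, let
\[ R_\theta = \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix} \]

denote the reflection with respect to the line $\mathbb{R}(\cos\theta, \sin\theta)$. Then $R_\theta$ is
diagonalizable. This is an immediate consequence of the last paragraph of
Section 5.9. However, we repeat the proof in a slightly different form. $R_\theta$
maps the vector $(\cos\theta, \sin\theta)$ to itself and maps any normal to the line, $(-\sin\theta, \cos\theta)$,
say, to its negative. Thus, the vectors

\[ v_1 := \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}, \qquad v_2 := \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \]

are eigenvectors of $R_\theta$ and form a basis. Let $T$ denote the orthogonal transformation which
takes $e_i$ to $v_i$, $i = 1, 2$. Then

\[ T = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

Also, we have

\begin{align*}
T^{-1}R_\theta Te_1 &= T^{-1}R_\theta v_1 = T^{-1}v_1 = e_1, \\
T^{-1}R_\theta Te_2 &= T^{-1}R_\theta v_2 = T^{-1}(-v_2) = -e_2.
\end{align*}

In matrix notation, we have

\[ \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]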
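The matrix identity is easy to confirm numerically for a sample angle. In this sketch the helper names (`matmul2`, `transpose`, `reflection`, `rotation`) and the angle $0.7$ are arbitrary illustrative choices; since $T$ is orthogonal, its inverse is computed as its transpose.

```python
import math


def matmul2(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(P):
    return [[P[j][i] for j in range(2)] for i in range(2)]


def reflection(theta):
    """Reflection in the line spanned by (cos theta, sin theta)."""
    return [[math.cos(2 * theta), math.sin(2 * theta)],
            [math.sin(2 * theta), -math.cos(2 * theta)]]


def rotation(theta):
    """Orthogonal map T taking e_1, e_2 to v_1, v_2."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


theta = 0.7
T = rotation(theta)
# T is orthogonal, so T^{-1} = T^t.
D = matmul2(transpose(T), matmul2(reflection(theta), T))
print(D)
```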




Example 7.2.5 Let us investigate when

\[ A = \begin{pmatrix} a & h \\ h & b \end{pmatrix} \]
has a nonzero eigenvector. If it has a nonzero eigenvector it has an eigenvector
of unit length. So, we may assume that

\[ v = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \quad \text{for some } \theta \in \mathbb{R} \]

is an eigenvector of $A$ with eigenvalue $\lambda \in \mathbb{R}$: $Av = \lambda v$. This is equivalent
to the system of linear equations

\begin{align*}
a\cos\theta + h\sin\theta &= \lambda\cos\theta \\
h\cos\theta + b\sin\theta &= \lambda\sin\theta.
\end{align*}

Let us formally divide the first equation by $\cos\theta$ and the second by $\sin\theta$
without worrying about one of them being zero. (Only one of them could
be zero!) We then get $a + h\tan\theta = \lambda$ and $h\cot\theta + b = \lambda$. Eliminating $\lambda$
from these two equations, we get $a - b = h(\cot\theta - \tan\theta) = 2h\cot 2\theta$, or $\cot 2\theta = \dfrac{a - b}{2h}$, at this time
not worrying about $h$ being zero! If $h$ is zero, $A$ is then already diagonal
and hence the standard basis is also an eigen basis.

Now, if $\sin\theta = 0$, then $\cos\theta = \pm 1$ so that we may take $v = e_1$. In this
case, working as above we find that $\lambda = a$ and $h = 0$. Thus, $A$ is diagonal.


Before we go any further, it is important to realize that not all linear 
maps are diagonalizable. 


Example 7.2.6 Consider the linear map $A: \mathbb{R}^2 \to \mathbb{R}^2$ given by $Ae_1 = e_2$
and $Ae_2 = 0$. Then
\[ A = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \]

If $A$ has any nonzero eigenvector

\[ v = \begin{pmatrix} x \\ y \end{pmatrix}, \]
say, with eigenvalue $\lambda$, we then end up with the system of linear equations
$\lambda x = 0$ and $\lambda y = x$. If $\lambda \ne 0$, then $x = 0$ so that $y = 0$. Thus the
vector is 0, a contradiction. Note that we have also shown that the
only eigenvalue is 0. If $\lambda = 0$, then we may take $v_1 = e_2$. Is there a second
nonzero eigenvector $v_2$ so that $\{v_1, v_2\}$ is an eigen basis? If

\[ v_2 = \begin{pmatrix} x \\ y \end{pmatrix} \]

is an eigenvector, then the equations above with $\lambda = 0$ give $x = 0$, so that $v_2$ is a scalar multiple of $e_2$,
and we have only one (up to scalar multiple) eigenvector and hence there
is no eigen basis for $A$.

We could have made a slick argument using the characteristic polynomial.
Note that the characteristic equation of $A$ is $X^2 = 0$. Hence the
eigenvalues are 0 and 0. Thus we need two linearly independent eigenvectors.
Clearly $e_2$ is an eigenvector with eigenvalue 0. We look for a second one. If

\[ v = \begin{pmatrix} x \\ y \end{pmatrix} \]
is one such then $Av = 0$ yields

\[ \begin{pmatrix} 0 \\ x \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

Thus any eigenvector is a scalar multiple of $e_2$! There is no way we can
find an eigen basis.

The purpose of giving a bare-handed approach (the first proof) and a 
more theoretic one (the second one) is to help the reader 


(1) appreciate the power of building a theory, and 
(2) strike his/her own path when there is no theory to work on. 


Let us assume that $A: V \to V$ has a (real) eigenvalue $\lambda$ with a nonzero
eigenvector $v$: $Av = \lambda v$. We can rewrite this as $(A - \lambda I)v = 0$. By
Theorem 6.4.4 we know that this happens if and only if $\det(A - \lambda I) = 0$.
The crucial observation now is that the left side is a polynomial of degree
$n$ in $\lambda$, as can be seen from the explicit formula for the determinant. Thus
any (real) eigenvalue is a real root of the equation $\det(A - \lambda I) = 0$.

Definition 7.2.3 Let $A \in M(n, \mathbb{R})$. Then the characteristic polynomial of $A$ is
$\chi_A(X) := \det(A - XI)$, where $X$ is an indeterminate.

Thus a real $\lambda$ is an eigenvalue of $A$ if and only if it is a real root of the
characteristic polynomial of $A$.


Now the fundamental theorem of algebra tells us that a polynomial
equation of degree $n$ in one indeterminate has $n$ complex roots (counted with multiplicity). Thus,
we may have "complex eigenvalues" but no eigenvector in $\mathbb{R}^n$.

Example 7.2.7 Let the linear map $A: \mathbb{R}^2 \to \mathbb{R}^2$ be given by $Ae_1 = e_2$ and
$Ae_2 = -e_1$. Then
\[ A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \]

It is the rotation in the anticlockwise direction by $\pi/2$. Its characteristic
equation is $X^2 + 1 = 0$. It has complex roots $\pm\sqrt{-1}$. Thus there are no
eigenvectors of $A$ in $\mathbb{R}^2$.




These examples illustrate the problems we may encounter if we want to
find an eigen basis for a given $A: V \to V$:

(1) There may not be enough eigenvectors corresponding to a given (real)
eigenvalue as in Example 7.2.6.

(2) There may not be any (real) eigenvalue as in Example 7.2.7.
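Both obstructions can be detected mechanically for $2 \times 2$ matrices. The following Python sketch is a hedged illustration with hypothetical function names: it finds the real roots of $X^2 - \operatorname{tr}(A)X + \det(A)$ and, for a given root, the dimension of the solution space of $(A - \lambda I)v = 0$.

```python
import math


def real_eigenvalues_2x2(A):
    """Distinct real roots of X^2 - tr(A) X + det(A), in increasing order."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc < 0:
        return []                      # only complex roots: no real eigenvalue
    root = math.sqrt(disc)
    return sorted({(tr - root) / 2, (tr + root) / 2})


def eigenspace_dim_2x2(A, lam, tol=1e-12):
    """Dimension of {v : Av = lam v} for a 2x2 matrix A."""
    (a, b), (c, d) = A
    entries = [a - lam, b, c, d - lam]
    if all(abs(e) < tol for e in entries):
        return 2                       # A - lam I is the zero matrix
    det = (a - lam) * (d - lam) - b * c
    return 1 if abs(det) < tol else 0  # rank 1 when singular, rank 2 otherwise


nilpotent = [[0, 0], [1, 0]]           # Example 7.2.6
quarter_turn = [[0, -1], [1, 0]]       # Example 7.2.7
print(real_eigenvalues_2x2(nilpotent), eigenspace_dim_2x2(nilpotent, 0.0))
print(real_eigenvalues_2x2(quarter_turn))
```

For the first matrix the only eigenvalue is 0 with a one-dimensional eigenspace, too small for an eigen basis of $\mathbb{R}^2$; for the second there is no real eigenvalue at all.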


Lemma 7.2.2 Let $A: V \to V$ be linear. Assume that $v_i$ is a (nonzero)
eigenvector of $A$ with eigenvalue $a_i$ and that $a_i \ne a_j$ for $i \ne j$,
$1 \le i, j \le r$. Then $\{v_i\}_{i=1}^r$ is a linearly independent set.

Proof Let us first look at the case $r = 2$. If $v_1$ and $v_2$ are linearly
dependent, then each is a multiple of the other. Assume that $v_1 = \lambda v_2$.
Now let us operate $A$ on both sides. We get

\[ a_1v_1 = Av_1 = A(\lambda v_2) = \lambda Av_2 = \lambda a_2v_2 = a_2v_1. \]

Hence, $(a_1 - a_2)v_1 = 0$. Since $v_1 \ne 0$, we see that $a_1 = a_2$, a contradiction.


Now we wish to generalize this argument to all $r$ by induction. The
result is true for $r = 1$, as $v_1 \ne 0$. Let us assume the result for all $r \le n - 1$.
We shall prove the result for $r = n$. Assume that

\[ \sum_{i=1}^{n} \lambda_i v_i = 0. \tag{7.2.1} \]

We want to show that $\lambda_i = 0$ for all $i$. Let us operate $A$ on both sides of
the equation to get

\[ 0 = \sum_{i=1}^{n} \lambda_i a_i v_i. \tag{7.2.2} \]

Multiply Equation (7.2.1) by $a_n$ and subtract it from Equation (7.2.2). We
get $\sum_{j=1}^{n-1} (a_j - a_n)\lambda_j v_j = 0$. Now $\{v_j\}_{j=1}^{n-1}$ is a set of nonzero eigenvectors
with pairwise distinct eigenvalues and hence it is linearly independent
by induction hypothesis. Thus we conclude that $(a_j - a_n)\lambda_j = 0$ for
$1 \le j \le n - 1$. Since $a_j - a_n \ne 0$ we conclude that $\lambda_j = 0$ for $1 \le j \le n - 1$.
Using this in Equation (7.2.1) yields that $\lambda_n v_n = 0$. Since $v_n \ne 0$, it follows
that $\lambda_n = 0$. Thus $\lambda_i = 0$ for $1 \le i \le n$. □
Remark 7.2.1 The above induction proof may also be rephrased in a
different way which is useful in certain circumstances. The rephrasing goes
as follows: Assume that $\{v_i\}_{i=1}^r$ is linearly dependent. We look for the
minimum $m$ such that $\{v_k\}_{k=1}^m$ is linearly dependent, say, $\sum_{k=1}^{m} \lambda_k v_k = 0$.
Applying $A$ to both sides of this equation and arguing as above we deduce
that $\{v_j\}_{j=1}^{m-1}$ is linearly dependent. This contradicts the minimality of $m$,
thereby establishing the linear independence of $\{v_i\}_{i=1}^r$.
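Remark 7.2.2 Yet another proof, which is completely different from the
earlier ones, uses the Vandermonde determinant. We apply $A^k$, $0 \le k \le n - 1$,
to both sides of Equation (7.2.1) to get

\begin{align*}
\lambda_1 v_1 + \cdots + \lambda_n v_n &= 0 \\
a_1\lambda_1 v_1 + \cdots + a_n\lambda_n v_n &= 0 \\
a_1^2\lambda_1 v_1 + \cdots + a_n^2\lambda_n v_n &= 0 \\
&\;\;\vdots \\
a_1^{n-1}\lambda_1 v_1 + \cdots + a_n^{n-1}\lambda_n v_n &= 0.
\end{align*}

This can be written as a matrix equation

\[ (\lambda_1 v_1, \ldots, \lambda_n v_n) \begin{pmatrix} 1 & a_1 & a_1^2 & \cdots & a_1^{n-1} \\ 1 & a_2 & a_2^2 & \cdots & a_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^{n-1} \end{pmatrix} = 0. \]

The determinant of the square matrix is the Vandermonde determinant, whose value is
\[ \prod_{1 \le j < i \le n} (a_i - a_j) \ne 0 \]
since $a_i \ne a_j$ for $i \ne j$. Hence, we conclude that $(\lambda_1 v_1, \ldots, \lambda_n v_n)$ is the
zero vector. Since $v_i \ne 0$, we deduce that $\lambda_i = 0$ for all $i$.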
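The product formula for the Vandermonde determinant can be checked on small cases. The sketch below computes the determinant from the permutation (Leibniz) expansion, which is only practical for small $n$, and compares it with the product; the function names and the sample values are assumptions for illustration.

```python
from itertools import permutations


def det(M):
    """Determinant via the permutation expansion (fine for small matrices)."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        # The sign of the permutation is (-1)^(number of inversions).
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total


def vandermonde(a):
    """Matrix whose i-th row is (1, a_i, a_i^2, ..., a_i^(n-1))."""
    return [[x ** k for k in range(len(a))] for x in a]


def product_formula(a):
    """Product of (a_i - a_j) over all pairs j < i."""
    result = 1
    for i in range(len(a)):
        for j in range(i):
            result *= a[i] - a[j]
    return result


values = [2, 3, 5, 7]
print(det(vandermonde(values)), product_formula(values))
```

With a repeated value, for example `[2, 3, 3, 7]`, both quantities are zero, which is exactly why the eigenvalues must be pairwise distinct in this argument.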


Exercise 7.2.3 Let $A: V \to V$ be linear and $\lambda$ be an eigenvalue of $A$. Let
$V_\lambda := \{v \in V \mid Av = \lambda v\}$. Show that $V_\lambda$ is a nonzero vector subspace of
$V$. ($V_\lambda$ is called the eigenspace corresponding to the eigenvalue $\lambda$.)


7.2.1 Cayley-Hamilton Theorem 


Let $f(X) = \sum_{k=0}^{n} a_k X^k$ be a polynomial in the indeterminate $X$ with real
coefficients $a_k$. Let $A$ be a square matrix of size $n$. We then define a new
matrix, denoted by $f(A)$, by setting

\[ f(A) := a_nA^n + a_{n-1}A^{n-1} + \cdots + a_1A + a_0I. \]

Theorem 7.2.3 (Cayley-Hamilton Theorem) Let $A$ be an $n \times n$ square
matrix. Let
\[ f(X) = X^n + c_{n-1}X^{n-1} + \cdots + c_1X + c_0 \]
be the characteristic polynomial of $A$. Then
\[ f(A) := A^n + c_{n-1}A^{n-1} + \cdots + c_1A + c_0I = 0. \]

Thus, $A$ satisfies its characteristic equation.
Proof Recall that the adjunct $\operatorname{adj}(B)$ of $B$ defined in Section 6.4 has the
property that $B \operatorname{adj}(B) = \det(B)I$. We apply this result to the matrix
$XI - A$ to get

\[ (XI - A)\operatorname{adj}(XI - A) = \det(XI - A)I = f(X)I. \tag{7.2.3} \]

Now, $\operatorname{adj}(XI - A)$ is a matrix whose entries are determinants (up to sign)
of $(n - 1)$-square submatrices of $XI - A$. Hence $\operatorname{adj}(XI - A)$ is a matrix
whose entries are polynomials in $X$ of degree at most $n - 1$:

\[ \operatorname{adj}(XI - A) = B_{n-1}X^{n-1} + \cdots + B_1X + B_0, \]

where $B_i$ are matrices with real entries. Hence Equation (7.2.3) can be
written as

\[ (XI - A)(B_{n-1}X^{n-1} + \cdots + B_1X + B_0) = (X^n + c_{n-1}X^{n-1} + \cdots + c_1X + c_0)I. \tag{7.2.4} \]
Comparing the coefficients of like powers of $X$, we get

\begin{align*}
B_{n-1} &= I \\
B_{n-2} - AB_{n-1} &= c_{n-1}I \\
B_{n-3} - AB_{n-2} &= c_{n-2}I \\
&\;\;\vdots \\
B_0 - AB_1 &= c_1I \\
-AB_0 &= c_0I.
\end{align*}

Multiplying the first of these equations by $A^n$, the second by $A^{n-1}$, and so on,
the last but one by $A$ and the last one by $I$, and adding them, the left sides
cancel in pairs and we get the desired result. □
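For a concrete feel, the theorem can be verified on a small integer matrix. The sketch below handles the $3 \times 3$ case only and uses the standard expansion $\det(XI - A) = X^3 - \operatorname{tr}(A)X^2 + mX - \det(A)$, where $m$ is the sum of the principal $2 \times 2$ minors; the helper names and the sample matrix are illustrative assumptions.

```python
def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_3x3(A):
    """Coefficients (c2, c1, c0) of det(XI - A) = X^3 + c2 X^2 + c1 X + c0."""
    tr = A[0][0] + A[1][1] + A[2][2]
    minors = ((A[0][0] * A[1][1] - A[0][1] * A[1][0])
              + (A[0][0] * A[2][2] - A[0][2] * A[2][0])
              + (A[1][1] * A[2][2] - A[1][2] * A[2][1]))
    det = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
           - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
           + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))
    return -tr, minors, -det


def evaluate_char_poly_at(A):
    """The matrix f(A) = A^3 + c2 A^2 + c1 A + c0 I."""
    c2, c1, c0 = char_poly_3x3(A)
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    return [[A3[i][j] + c2 * A2[i][j] + c1 * A[i][j] + (c0 if i == j else 0)
             for j in range(3)] for i in range(3)]


A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
print(char_poly_3x3(A), evaluate_char_poly_at(A))
```

Note that the scalar $c_0$ enters as $c_0 I$: the identity being checked is an equation between matrices, not between numbers.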

Exercise 7.2.4 What is wrong with the following "proof"? In the
equation $f(X) = \det(XI - A)$ put $X = A$. We then get the result.
Hint: The equation required to be proved involves matrices whereas you are
getting a scalar equation.

There is a class of good linear maps for which we can always find an 


eigen basis. We introduce them next. 




7.3 Diagonalization of Symmetric Matrices 




Throughout this section, we let $V$ denote a finite dimensional real inner product space.

Definition 7.3.1 A linear map $T: V \to V$ is said to be symmetric if for
all $x, y \in V$ we have $\langle Tx, y\rangle = \langle x, Ty\rangle$.

Fix an orthonormal basis $\{e_i\}_{i=1}^n$ of $V$. Let $T: V \to V$ be linear and
let $A$ denote the matrix of $T$ with respect to this basis. Then $T$ is
symmetric if and only if $A$ is a symmetric matrix, that is, $A = A^t$.

Lemma 7.3.1 Let $T: V \to V$ be a symmetric linear map. Then eigenvectors
$v_1, v_2$ of $T$ corresponding to eigenvalues $\lambda_1, \lambda_2$ with $\lambda_1 \ne \lambda_2$ are orthogonal to each
other.

The above lemma gives yet another proof of Lemma 7.2.2 in the case of
a symmetric linear map on an inner product space.
We shall prove in this section that if $T$ is a symmetric linear map, then
there exists a basis of $V$ consisting of eigenvectors of $T$. That is, there
exists a basis $\{v_i\}_{i=1}^n$ of $V$ such that there exist real numbers $\lambda_i$, $1 \le i \le n$,
with $Tv_i = \lambda_iv_i$. Note that with respect to this basis, the matrix $A$ of $T$
will be diagonal with the eigenvalues as the diagonal entries. We offer two
proofs of this result. Both proofs use some facts with which the reader may not
be familiar at this stage. The first one is more algebraic while the second
is more geometric and analytic in character. In both the proofs, the major
burden is to show the existence of an eigenvalue for a symmetric linear map.
The result is then completed by induction on the dimension.
The key idea of the first proof is to use the characteristic polynomial
$\det(A - XI)$ of $A$ and the fundamental theorem of algebra, whose statement
we recall.


Theorem 7.3.2 (Fundamental Theorem of Algebra) Let
\[ p(X) := a_nX^n + a_{n-1}X^{n-1} + \cdots + a_1X + a_0 \]

be a polynomial with coefficients $a_i \in \mathbb{C}$. Assume that $n \ge 1$ and that
$a_n \ne 0$. Then $p$ has a complex root, that is, there exists a complex number
$\lambda$ such that

\[ p(\lambda) := a_n\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 = 0. \]




We do not prove this result. For a proof, the reader may consult any 
book on complex analysis. 

Thus, by the fundamental theorem of algebra, the characteristic
polynomial of $T$ has a complex root, say, $\lambda$.


Definition 7.3.2 The characteristic polynomial of a linear map $T: V \to V$
is that of any of its matrix representations. Note that this is well-defined
in the following sense. If $A$ (respectively $B$) is the matrix of $T$ with
respect to the basis $\{v_1, \ldots, v_n\}$ (respectively $\{w_1, \ldots, w_n\}$) then $A$ and
$B$ are conjugate: There exists an invertible matrix $C$ such that $A = CBC^{-1}$. Hence
$A - \lambda I$ and $B - \lambda I$ are conjugate: $A - \lambda I = C(B - \lambda I)C^{-1}$. Hence, their
determinants are the same, that is, the characteristic polynomials of $A$ and
$B$ are the same.

A root of the characteristic equation $\det(A - XI) = 0$ is called a
characteristic root (or characteristic value) of $T$. The characteristic equation is an invaluable
tool in our understanding of linear transformations or matrices. Note
that if $a$ is an eigenvalue of $T$, then $a$ is a characteristic root, but the
converse is not true. See Example 7.2.7. However, if $a$ is a real root of the
characteristic equation of $T$, then $a$ is an eigenvalue of $T$. For, this means
that $\det(T - aI) = 0$. Hence by Theorem 6.4.4, there exists a nonzero
vector $v \in V$ such that $(T - aI)v = 0$. That is, $v$ is an eigenvector of $T$.


Before we proceed to the main result of this section, let us establish an 
easy result. 


Proposition 7.3.3 If the characteristic equation of $T$ has $n$ distinct real
roots, then $T$ is diagonalizable.

Proof Let $v_1, \ldots, v_n$ be eigenvectors corresponding to the $n$ distinct
roots of the characteristic equation of $T$. Then by Lemma 7.2.2, $\{v_1, \ldots, v_n\}$ is
linearly independent and hence, as $\dim V = n$, forms a basis of $V$. We
have already shown that with respect to this basis, the matrix $A$ of $T$ will
be diagonal. □


Theorem 7.3.4 (Spectral Theorem for Symmetric Linear Maps)
Let $T: V \to V$ be a symmetric linear map on a (finite dimensional real)
inner product space. Then there exists an orthonormal basis of $V$ consisting
of eigenvectors of $T$.

The crucial observation towards the proof of this theorem is the fact that
any characteristic root of a symmetric linear map is real and hence is an
eigenvalue.

Lemma 7.3.5 All characteristic roots of a symmetric linear map are real.
Equivalently, all characteristic roots of a symmetric matrix are real. In
particular, there is an eigenvalue of $T$.



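Proof Let $\lambda$ be a root of the characteristic polynomial of the symmetric
matrix $A$. Suppose $\lambda$ is not real, that is, $\operatorname{Im}\lambda \ne 0$. We have

\[ \det(A - \lambda I) = 0. \]

Therefore $\det[(A - \lambda I)(A - \bar{\lambda} I)] = 0$.¹
Writing $\lambda = \operatorname{Re}\lambda + i\operatorname{Im}\lambda$, the last relation comes to

\[ \det[(A - \operatorname{Re}\lambda\, I)^2 + (\operatorname{Im}\lambda)^2 I] = 0. \]

Since this last matrix is real, there exists a nonzero vector $x$ such that
$[(A - \operatorname{Re}\lambda\, I)^2 + (\operatorname{Im}\lambda)^2 I]x = 0$ and, in particular,

\[ \langle [(A - \operatorname{Re}\lambda\, I)^2 + (\operatorname{Im}\lambda)^2 I]x, x\rangle = 0. \]

Since $A - \operatorname{Re}\lambda\, I$ is symmetric, we get
\[ \langle (A - \operatorname{Re}\lambda\, I)x, (A - \operatorname{Re}\lambda\, I)x\rangle + (\operatorname{Im}\lambda)^2 \langle x, x\rangle = 0. \]
Since the left hand side is positive, this is impossible. □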


Definition 7.3.3 Let $T: V \to V$ be any linear map. We say that a vector
subspace $W$ is invariant under $T$ if $Tw \in W$ for all $w \in W$. We also say
that $W$ is an invariant subspace of $T$.


The second observation needed for the proof of Theorem 7.3.4 is the 
following lemma: 


Lemma 7.3.6 Let $T: V \to V$ be a symmetric linear map. Assume that $W$
is a vector subspace of $V$ invariant under $T$. Then

\[ W^\perp = \{v \in V \mid \langle v, w\rangle = 0 \text{ for all } w \in W\} \]
is also invariant under $T$.

Proof This is easily verified. Let $v \in W^\perp$ and $w \in W$. The result follows from the
following:
\[ \langle Tv, w\rangle = \langle v, Tw\rangle = 0. \]

The first equality is by the symmetry of $T$. The second is true, since
$Tw \in W$ (as $W$ is invariant under $T$) and $v \in W^\perp$. □


¹ We are using here the formula $\det(AB) = \det A \cdot \det B$ for matrices with complex
entries.




Proof (of Theorem 7.3.4) We prove the main theorem by induction on
the dimension of the inner product space.

Let $P_n$ be the statement: If $X$ is an $n$-dimensional inner product space
and if $F: X \to X$ is a symmetric linear map, then $X$ has an orthonormal
basis consisting of eigenvectors of $F$.

$P_1$ is clearly true: For, if $V$ is a one dimensional inner product space, let $v$
be any nonzero vector in $V$. Then $u := v/\|v\|$ is a unit vector and $\{u\}$ is
an orthonormal basis of $V$. If $T: V \to V$ is any linear map, then we already
know that there exists a real $\lambda \in \mathbb{R}$ such that $Tu = \lambda u$ (see Example 4.1.10
and Exercise 4.1.9). Hence $P_1$ is true.

Assume that $P_{n-1}$ is true. Let $T: V \to V$ be a symmetric linear map
on an $n$-dimensional inner product space. By Lemma 7.3.5, there exists an
eigenvalue, say, $\lambda$ of $T$. Let $w$ be an eigenvector of $T$ with eigenvalue $\lambda$.
Then $W := \mathbb{R}w$ is a vector subspace of $V$ invariant under $T$. Hence, by Lemma 7.3.6, $W^\perp$ is
also an invariant subspace of $T$. Note that by restricting the inner product
of $V$ to $W^\perp$, $W^\perp$ becomes an inner product space. As $W^\perp$ is the kernel
of the linear map $f_w: V \to \mathbb{R}$ given by $f_w(v) := \langle v, w\rangle$, by the rank-nullity
theorem, $\dim W^\perp = n - 1$. The restriction $T_1$ of $T$ to $W^\perp$ is obviously
symmetric. Hence, by induction hypothesis, there exists an orthonormal
basis, say, $\{v_1, \ldots, v_{n-1}\}$ of $W^\perp$ consisting of eigenvectors of $T_1$. Since
$V = W \oplus W^\perp$ is an orthogonal direct sum, the set $\{v_1, \ldots, v_{n-1}, v_n := w/\|w\|\}$
is an orthonormal basis of $V$ such that each $v_i$ is an eigenvector of $T$. □
As a corollary, we obtain the following result. 


Proposition 7.3.7 Let $A$ be a real symmetric $n \times n$ matrix. Then there
exists an orthogonal matrix $B$ such that $B^{-1}AB$ is a diagonal matrix whose
entries are the eigenvalues of $A$.


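Proof Let $T: \mathbb{R}^n \to \mathbb{R}^n$ denote the symmetric map whose matrix with
respect to the standard basis of $\mathbb{R}^n$ is $A$. To wit, $Tx = Ax$ for $x \in \mathbb{R}^n$.

Then, by the spectral theorem, there exists an orthonormal basis $\{v_i\}_{i=1}^n$
of $\mathbb{R}^n$ consisting of eigenvectors of $T$, say $Tv_i = \lambda_iv_i$. Let

\[ v_i = \sum_j b_{ji}e_j = \begin{pmatrix} b_{1i} \\ \vdots \\ b_{ni} \end{pmatrix}. \]

Let $B = (b_{ij})$. Then $B$ is easily seen to be orthogonal. Note that when we
view $B$ as a linear map on $\mathbb{R}^n$, then $Be_i = v_i$ for all $i$. Hence
$B^{-1}AB = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, since

\[ B^{-1}AB(e_i) = B^{-1}Av_i = B^{-1}(\lambda_iv_i) = \lambda_ie_i, \quad \text{for all } i. \] □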


As for the second proof of the main theorem, as observed earlier, it
suffices to prove the existence of an eigenvalue of $T$. We warn the reader
that this proof is quite demanding, as it requires much more background
in diverse fields such as metric spaces, analysis, calculus etc. However, it is
quite worthwhile to learn the proof, as it brings out the interplay between
the various branches of mathematics and gives a glimpse of the essential
unity of the subject.


Lemma 7.3.8 Let V be a finite dimensional inner product space. Let T 
be a symmetric linear map on V. Then T has an eigenvalue. 


Proof Our proof has a simple geometric interpretation. What we are
going to do is to look for the minor axis of the "ellipse" $\{\langle Tx, x\rangle = 1\}$ (see
Exercise 9.6.27).


Figure 7.3.1 Axes of an ellipse and eigenvalues. 


Let
\[ S := \{x \in V \mid \|x\| = 1\} \]

be the unit sphere in $V$. We use the concepts from the theory of metric
spaces. $V$ is a metric space with $d(x, y) := \|x - y\|$. Since $V$ is a finite
dimensional inner product space, it is isometric to $\mathbb{R}^n$. Hence the isometric
image of $S$ in $\mathbb{R}^n$ is a compact subset of $\mathbb{R}^n$ as it is closed and bounded.
We consider the function $f(x) := \langle Tx, x\rangle$ on $S$. We now show that
the function $T: V \to V$ is continuous. Let us fix an orthonormal basis
$\{e_i\}_{i=1}^n$ (where $n = \dim V$) of $V$. Then for any $x \in V$, we can write
$x = \sum_i x_ie_i$ with $\|x\|^2 = \langle x, x\rangle = \sum_i x_i^2$. Note that $|x_i| \le \|x\|$. We claim
that $\|Tx\| \le C\|x\|$ for some constant $C > 0$ and for all $x \in V$. Indeed,

\begin{align*}
\|Tx\| &= \Big\| \sum_i x_i Te_i \Big\| \\
&\le \sum_i |x_i| \, \|Te_i\| \\
&\le \sum_i \|x\| \, \|Te_i\| \\
&= \Big( \sum_i \|Te_i\| \Big) \|x\|.
\end{align*}

If we take $C = \sum_i \|Te_i\|$, the claim obtains. From this it follows that

\[ \|Tx - Ty\| = \|T(x - y)\| \le C\|x - y\| \]
and hence the (uniform) continuity of $T$.


Since $T$ is continuous and the inner product is continuous, $f$ is a real
valued continuous function on $V$: For,

\begin{align*}
f(x + h) - f(x) &= \langle T(x + h), x + h\rangle - \langle Tx, x\rangle \\
&= 2\langle Tx, h\rangle + \langle Th, h\rangle,
\end{align*}

where in the second equality we used the symmetry of $T$. It follows that
\[ |f(x + h) - f(x)| \le 2\|Tx\| \|h\| + \|Th\| \|h\| \to 0 \]

as $h \to 0$, thanks to the continuity of $T$ and of the norm function on $V$.


The function $f$ is continuous on the compact set $S$. Hence it assumes a
minimum, say, $\lambda$ on $S$, at $x_0 \in S$.

Claim 1: $\lambda$ is an eigenvalue and $x_0$ is an eigenvector.
This follows from

Claim 2: $\langle Tx_0, y\rangle = 0$ for all $y \in V$ with $y \perp x_0$.

Claim 2 $\Rightarrow$ Claim 1: Claim 2 means that $Tx_0$ must lie in the one-dimensional
space spanned by $x_0$, that is, $Tx_0 = \mu x_0$ for some scalar $\mu$. But this
scalar $\mu$ must be $\lambda$: $\mu = \langle Tx_0, x_0\rangle = \lambda$. Hence $Tx_0 = \lambda x_0$. Thus Claim 1
and hence the lemma is proved.

We now prove Claim 2: The idea of the proof is simple. We consider a
curve $g: \mathbb{R} \to S$ such that $g(0) = x_0$ and consider the one variable function
$t \mapsto f(g(t))$. Since this function attains a minimum at $t = 0$, its derivative
must be 0 at that point. Computing the derivative gives the result.




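Now to get to work, let $y \in V$ be such that $\langle x_0, y\rangle = 0$. Let
\[ x(t) := x_0 + ty. \]

Then $\|x(t)\|^2 = 1 + t^2\|y\|^2$. Let $u(t) := (1 + t^2\|y\|^2)^{-1/2}(x_0 + ty)$. Then
clearly $u(t) \in S$ for all $t \in \mathbb{R}$. Consider the function $h: t \mapsto \langle Tu(t), u(t)\rangle$.
By our assumption on $x_0$, this function attains a minimum at $t = 0$ and
hence $h'(0) = 0$. We compute the derivative of $h$:

\begin{align*}
h'(0) &= \frac{d}{dt}\langle Tu(t), u(t)\rangle \Big|_{t=0} \\
&= \frac{d}{dt}\Big[ (1 + t^2\|y\|^2)^{-1} \langle T(x_0 + ty), x_0 + ty\rangle \Big] \Big|_{t=0} \\
&= \frac{d}{dt}(1 + t^2\|y\|^2)^{-1} \Big|_{t=0} \, \langle T(x_0 + ty), x_0 + ty\rangle \Big|_{t=0} \\
&\quad + (1 + t^2\|y\|^2)^{-1} \Big|_{t=0} \, \frac{d}{dt}\langle T(x_0 + ty), x_0 + ty\rangle \Big|_{t=0} \\
&= \big( -(1 + t^2\|y\|^2)^{-2} \, 2t\|y\|^2 \big) \Big|_{t=0} \, \lambda \\
&\quad + \frac{d}{dt}\big( \langle Tx_0, x_0\rangle + t\langle Tx_0, y\rangle + t\langle Ty, x_0\rangle + t^2\langle Ty, y\rangle \big) \Big|_{t=0} \\
&= 0 + \langle Tx_0, y\rangle + \langle Ty, x_0\rangle.
\end{align*}

Since $T$ is symmetric, the right side is $2\langle Tx_0, y\rangle$. Hence
$h'(0) = 0$ if and only if $2\langle Tx_0, y\rangle = 0$. This completes the proof of Claim 2.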

We may also consider another curve (in place of $x(t)$ above) which arises more geometrically as follows: Let $x_0 \in S$ be as above. Let $y \in S$ with $x_0 \perp y$. Then $x_0$ and $y$ span a two-dimensional vector subspace (a plane through the origin) which intersects the sphere S along a great circle (see Figure 7.3.2). This curve on S is nothing other than the unit circle on the plane $\mathbb{R}x_0 + \mathbb{R}y$. Since $\|x_0\| = 1 = \|y\|$ and $\langle x_0, y\rangle = 0$, this curve is given by

\[
c(t) = \cos t\; x_0 + \sin t\; y.
\]

(We invite the reader to check that $c(t) \in S$.) Proceeding as earlier, we again get the result $\langle Tx_0, y\rangle = 0$. $\square$

Figure 7.3.2 Sphere.
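The statement that a minimum of $f(x) = \langle Tx, x\rangle$ on the unit sphere is attained at an eigenvector can be illustrated in the plane, where the unit sphere is the circle $c(t) = \cos t\, e_1 + \sin t\, e_2$. This is a rough numerical sketch: the symmetric matrix `T` and the grid size are arbitrary choices made for the illustration, and a grid search only approximates the true minimizer.

```python
import math

def apply(T, x):
    """Apply the matrix T to the vector x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

def inner(u, v):
    """Dot product."""
    return sum(a * b for a, b in zip(u, v))

def f(T, x):
    """f(x) = <Tx, x>."""
    return inner(apply(T, x), x)

def circle(t):
    """Point of the unit circle c(t) = cos t e_1 + sin t e_2."""
    return [math.cos(t), math.sin(t)]

T = [[2.0, 1.0], [1.0, 2.0]]   # illustrative symmetric matrix (assumption)

steps = 3600                   # grid over [0, pi); antipodal points give the same f
best_t = min((k * math.pi / steps for k in range(steps)),
             key=lambda t: f(T, circle(t)))
minimizer = circle(best_t)
lam = f(T, minimizer)          # approximate minimum of f on the circle

# If the minimizer is an eigenvector with eigenvalue lam, T x - lam x vanishes.
residual = [a - lam * b for a, b in zip(apply(T, minimizer), minimizer)]
```

For this particular matrix $f(c(t)) = 2 + \sin 2t$, so the minimum value 1 is attained at $t = 3\pi/4$, which happens to lie on the grid.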
I hope that the reader enjoyed the second proof even though it could have been a little overwhelming. I suggest that the reader go through this proof a couple of times more to relish it.

8. Classification of Quadrics


8.1 Conics and Quadrics

Recall the definition of a conic in coordinate geometry of the plane. A conic is the locus of the points in $\mathbb{R}^2$ satisfying a quadratic equation in two variables of the form

\[
f(x_1, x_2) := a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2 + b_1x_1 + b_2x_2 + c = 0. \tag{8.1.1}
\]

It is convenient to write the above equation in matrix notation. Let

\[
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}, \qquad B = (b_1, b_2), \qquad\text{and}\qquad X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.
\]

Then Equation (8.1.1) can be written, with $C = c$, as

\[
f(X) := X^tAX + BX + C = 0. \tag{8.1.2}
\]

A quadric is the analogue of a conic in higher dimensions. It is defined as the locus of a quadratic equation in n variables given by

\[
f(x) = \sum_{i,j} a_{ij}x_ix_j + \sum_i b_ix_i + c = X^tAX + BX + C = 0 \tag{8.1.3}
\]

where $A = (a_{ij})$ is an $n \times n$ symmetric matrix and $B = (b_1, \dots, b_n)$.

We simplify Equation (8.1.3) using an orthogonal transformation to diagonalize A and then a translation to eliminate as much as possible of the linear term $BX + C$.

We suggest that the reader work out the cases n = 2 and n = 3 in the following computations.


First a general computation: A translation leaves the quadratic coefficient matrix A invariant. Let $X = Y + K$, where K is a constant vector and $Y = (y_1, \dots, y_n)^t$. Then we have

\begin{align*}
f(X) &= (Y+K)^tA(Y+K) + B(Y+K) + C \\
&= Y^tAY + Y^tAK + K^tAY + K^tAK + BY + BK + C \\
&= Y^tAY + 2K^tAY + BY + BK + C' \qquad (C' = C + K^tAK) \\
&= Y^tAY + (2K^tA + B)Y + BK + C'
\end{align*}

since $Y^tAK$ is a scalar and hence equals its transpose $(Y^tAK)^t = K^tA^tY = K^tAY$.

On the other hand, how does a linear change of variables affect the coefficient matrix A? Let $X = PY$. Then

\begin{align*}
f(X) &= (PY)^tA(PY) + BPY + C \\
&= Y^tP^tAPY + BPY + C. \tag{8.1.4}
\end{align*}

Thus the coefficient matrix A changes into $P^tAP$.
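The two computations above (a translation $X = Y + K$ and a linear change $X = PY$) can be checked on a small instance. The sketch below uses exact integer arithmetic; the helper names (`quadric`, `translate`, `change_basis`) and the sample data are hypothetical, chosen only for this illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    return [[dot(row, col) for col in zip(*N)] for row in M]

def quadric(A, B, C, X):
    """Evaluate f(X) = X^t A X + B X + C for a vector X given as a list."""
    AX = [dot(row, X) for row in A]
    return dot(X, AX) + dot(B, X) + C

def translate(A, B, C, K):
    """Coefficients of Y -> f(Y + K): (A, 2 K^t A + B, B K + C + K^t A K)."""
    KA = [dot(K, col) for col in zip(*A)]            # the row vector K^t A
    B_new = [2 * ka + b for ka, b in zip(KA, B)]
    C_new = dot(B, K) + C + dot(KA, K)
    return A, B_new, C_new

def change_basis(A, B, C, P):
    """Coefficients of Y -> f(P Y): (P^t A P, B P, C)."""
    A_new = matmul(matmul(transpose(P), A), P)
    B_new = [dot(B, col) for col in zip(*P)]
    return A_new, B_new, C

# Sample data (assumptions for this sketch): A symmetric, P invertible.
A, B, C = [[1, 2], [2, 3]], [4, 5], 6
K, P, Y = [1, -2], [[1, 1], [0, 1]], [3, 7]

X = [y + k for y, k in zip(Y, K)]
translation_ok = quadric(A, B, C, X) == quadric(*translate(A, B, C, K), Y)

PY = [dot(row, Y) for row in P]
change_ok = quadric(A, B, C, PY) == quadric(*change_basis(A, B, C, P), Y)
```

Note that `translate` returns the matrix A unchanged, which is the invariance asserted in the text, while `change_basis` replaces A by $P^tAP$.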

We can now effect an orthogonal change of variable so that P is an orthogonal matrix which diagonalizes A, that is, $P^{-1}AP = P^tAP$ is diagonal. Thus under the orthogonal transformation Equation (8.1.3) becomes

\[
f(X) = \lambda_1y_1^2 + \cdots + \lambda_ny_n^2 + b_1y_1 + \cdots + b_ny_n + c, \tag{8.1.5}
\]

where we again write $b_i$ for the entries of the new linear coefficient vector $BP$.

We now eliminate the $b_i$'s associated with nonzero $\lambda_i$'s by the standard trick of "completing the square", using a translation. Put $z_i = y_i + \frac{b_i}{2\lambda_i}$. Then the term $\lambda_iy_i^2 + b_iy_i$ becomes

\[
\lambda_iz_i^2 - \frac{b_i^2}{4\lambda_i}, \qquad \lambda_i \ne 0.
\]

In Equation (8.1.5) we may permute the variables so that $\lambda_1, \dots, \lambda_r$ are nonzero and also $\lambda_1 \ge \cdots \ge \lambda_r$. Now completing the squares as described above in the indices $1 \le i \le r$, we can write Equation (8.1.5) as

\[
f(X) = \lambda_1z_1^2 + \cdots + \lambda_rz_r^2 + b_{r+1}z_{r+1} + \cdots + b_nz_n + c' \tag{8.1.6}
\]

where $z_k = y_k$ for $k > r$ and $c' = c - \sum_{i=1}^r \frac{b_i^2}{4\lambda_i}$. The linear part can be changed by a rigid motion into the form $d\xi_{r+1}$, for some scalar $d$, without affecting the first r variables as follows: Consider the linear form

\[
\ell(z_{r+1}, \dots, z_n) := b_{r+1}z_{r+1} + \cdots + b_nz_n + c'. \tag{8.1.7}
\]

By assumption $B' := (0, \dots, 0, b_{r+1}, \dots, b_n) \ne 0$. Let $d := \|B'\| = \bigl(\sum_{i=r+1}^n b_i^2\bigr)^{1/2}$ be the norm.
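The completing-the-square step that leads from (8.1.5) to (8.1.6) can be sketched as a small routine. The function name `complete_squares` is hypothetical; the sample coefficients are those that arise later in Example 8.2.2, where the form $50u^2 - 25v^2 + 100u + 100v + 50$ is reduced.

```python
from fractions import Fraction

def complete_squares(lams, bs, c):
    """Given lam_i, b_i and c as in (8.1.5), return (shifts, c_prime) where
    z_i = y_i + shifts[i] and c_prime = c - sum of b_i^2 / (4 lam_i),
    the sum running over the indices with lam_i != 0."""
    shifts, c_prime = [], Fraction(c)
    for lam, b in zip(lams, bs):
        if lam != 0:
            shifts.append(Fraction(b, 2 * lam))
            c_prime -= Fraction(b * b, 4 * lam)
        else:
            shifts.append(Fraction(0))   # no square to complete; b_i stays linear
    return shifts, c_prime

# 50 u^2 - 25 v^2 + 100 u + 100 v + 50
shifts, c_prime = complete_squares([50, -25], [100, 100], 50)
# shifts are 1 and -2, i.e. X = u + 1 and Y = v - 2, and c_prime is 100
```

Exact fractions are used so that the result can be compared directly with a hand computation.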

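We first use a translation to kill the constant term $c'$. Let $k > r$ be such that $b_k \ne 0$. Let

\[
\eta_i = z_i \quad (i \ne k) \qquad\text{and}\qquad \eta_k = z_k + \frac{c'}{b_k}.
\]

Then in the $\eta$ variables, the linear form Equation (8.1.7) takes the form

\[
\ell(\eta_{r+1}, \dots, \eta_n) = b_{r+1}\eta_{r+1} + \cdots + b_n\eta_n. \tag{8.1.8}
\]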

Now consider the unit vector

\[
\frac{1}{d}(0, \dots, 0, b_{r+1}, \dots, b_n).
\]

This vector is orthogonal to the unit vectors

\[
(1, 0, \dots, 0),\ \dots,\ (0, \dots, 0, 1, 0, \dots, 0) \qquad (r \text{ terms}).
\]

Hence there exists an orthogonal transformation which takes these vectors to themselves and $\frac{1}{d}(0, \dots, 0, b_{r+1}, \dots, b_n)$ into $(0, \dots, 0, 1, 0, \dots, 0)$, with 1 in the $(r+1)$-th place. Let the coordinates with respect to this new orthonormal basis be $\xi_1, \dots, \xi_n$. (Note that $\xi_i = \eta_i$ for $1 \le i \le r$.) Then the linear form Equation (8.1.7) takes the form

\[
\ell(\xi_{r+1}, \dots, \xi_n) = d\,\xi_{r+1}. \tag{8.1.9}
\]

Notice that the rigid motion effected so far does not affect the first r coordinates at all.
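The text only asserts that a suitable orthogonal transformation exists. One concrete choice, offered here as an assumption and not as the construction the author has in mind, is a Householder reflection $H = I - 2ww^t/(w^tw)$ with $w = u - e_{r+1}$, where $u$ is the unit vector above. Since $w$ has zero entries in the first r places, $H$ fixes $e_1, \dots, e_r$. The function name below is hypothetical.

```python
import math

def householder_to_axis(b, r):
    """Return (H, d): H is an orthogonal matrix fixing e_1, ..., e_r and sending
    u = b / d to e_{r+1}, where b = (0, ..., 0, b_{r+1}, ..., b_n) and d = ||b||."""
    n = len(b)
    d = math.sqrt(sum(c * c for c in b))
    u = [c / d for c in b]
    target = [1.0 if i == r else 0.0 for i in range(n)]   # e_{r+1} (zero-based index r)
    w = [ui - ti for ui, ti in zip(u, target)]
    ww = sum(c * c for c in w)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if ww < 1e-30:             # u is already e_{r+1}; nothing to do
        return identity, d
    H = [[identity[i][j] - 2.0 * w[i] * w[j] / ww for j in range(n)] for i in range(n)]
    return H, d

# Illustrative data: n = 3, r = 1, linear part 3 z_2 + 4 z_3.
b = [0.0, 3.0, 4.0]
H, d = householder_to_axis(b, 1)
u = [c / d for c in b]
image_of_u = [sum(H[i][j] * u[j] for j in range(3)) for i in range(3)]
image_of_e1 = [H[i][0] for i in range(3)]
```

A reflection has determinant $-1$; if a rotation is wanted, one can follow it by a sign change in a coordinate beyond the $(r+1)$-th, when such a coordinate exists.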

Thus Equation (8.1.6) looks like one of the following after all the changes of coordinates:

\[
f(\xi) = \lambda_1\xi_1^2 + \cdots + \lambda_r\xi_r^2 + d\,\xi_{r+1} \tag{8.1.10}
\]

\[
f(\xi) = \lambda_1\xi_1^2 + \cdots + \lambda_r\xi_r^2 + c' \tag{8.1.11}
\]

where $\lambda_1 \ge \cdots \ge \lambda_r$, $\lambda_i$ nonzero, $d > 0$.

We have thus proved the following result.

Theorem 8.1.1 Under the Euclidean group of rigid motions, any quadratic form in n variables such as

\[
f(x) = \sum_{i,j} a_{ij}x_ix_j + \sum_i b_ix_i + c = X^tAX + BX + C = 0
\]

can be brought into one of the forms

\[
f(\xi) = \lambda_1\xi_1^2 + \cdots + \lambda_r\xi_r^2 + d\,\xi_{r+1}
\]

or

\[
f(\xi) = \lambda_1\xi_1^2 + \cdots + \lambda_r\xi_r^2 + c'.
\]


8.1.1 Classification of Quadrics 


We present the standard forms of conics in $\mathbb{R}^2$ and those of quadrics in $\mathbb{R}^3$ in the following tables.


Canonical Forms of Conics

Name of the Conic
Ellipse
Imaginary ellipse
Point ellipse
Hyperbola
Intersecting lines
Parabola
Parallel lines
Imaginary lines
Coincident lines


Canonical Forms of Quadrics

No. | Equation | Name of the Quadric
1. | $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$ | Ellipsoid
2. | $x^2/a^2 + y^2/b^2 + z^2/c^2 = -1$ | Imaginary ellipsoid
3. | $x^2/a^2 + y^2/b^2 - z^2/c^2 = 1$ | Hyperboloid of one sheet
4. | $x^2/a^2 + y^2/b^2 - z^2/c^2 = -1$ | Hyperboloid of two sheets
5. | $x^2/a^2 + y^2/b^2 = 2z$ | Elliptic paraboloid
6. | $x^2/a^2 - y^2/b^2 = 2z$ | Hyperbolic paraboloid
7. | $x^2/a^2 + y^2/b^2 + z^2/c^2 = 0$ | Point (imaginary elliptic cone)
8. | $x^2/a^2 + y^2/b^2 - z^2/c^2 = 0$ | Elliptic cone
9. | $x^2/a^2 + y^2/b^2 = 1$ | Elliptic cylinder
10. | $x^2/a^2 + y^2/b^2 = -1$ | Imaginary elliptic cylinder
11. | $x^2/a^2 - y^2/b^2 = 1$ | Hyperbolic cylinder
12. | $x^2 = 4py$ | Parabolic cylinder
13. | $x^2/a^2 + y^2/b^2 = 0$ | Line (imaginary intersecting planes)
14. | $x^2/a^2 - y^2/b^2 = 0$ | Intersecting planes
15. | $x^2 = a^2$ | Parallel planes
16. | $x^2 = -a^2$ | Imaginary parallel planes
17. | $x^2 = 0$ | Coincident planes

The figures of some quadric surfaces can be found at the end of this chapter.


8.2 Computational Examples 
We shall illustrate the above theoretical results in some concrete cases. 


Example 8.2.1 Let us consider the conic defined by $2xy = 1$. The matrix A is

\[
A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
\]

By (geometric) inspection we see that $e_1 + e_2$ is an eigenvector with eigenvalue 1. Hence the other eigenvector must be perpendicular to this vector. Thus $\pm(e_2 - e_1)$ is an eigenvector with eigenvalue $-1$. We choose as the orthonormal basis the basis consisting of eigenvectors $\{(e_1 + e_2)/\sqrt{2},\ (e_2 - e_1)/\sqrt{2}\}$. (This choice is made so that the new axes are got by a rotation from the standard ones.) Let the coordinates with respect to this new basis be denoted by $(u, v)$. Then the relation between the old and the new ones is given by $x = (u - v)/\sqrt{2}$ and $y = (u + v)/\sqrt{2}$ (see the section on the coordinates with respect to an orthonormal basis). Thus the equation reads in the new coordinates as $u^2 - v^2 = 1$.
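The change of coordinates in this example can be checked directly: substituting $x = (u - v)/\sqrt{2}$ and $y = (u + v)/\sqrt{2}$ into $2xy$ should give $u^2 - v^2$. A minimal sketch, with arbitrarily chosen sample points and hypothetical function names:

```python
import math

def old_coords(u, v):
    """Standard coordinates (x, y) of the point with new coordinates (u, v)."""
    s = math.sqrt(2.0)
    return (u - v) / s, (u + v) / s

def conic(x, y):
    """Left-hand side of the conic 2xy = 1."""
    return 2.0 * x * y

points = [(1.0, 0.0), (2.0, 1.5), (-0.5, 3.0), (1.25, -0.75)]
max_error = max(abs(conic(*old_coords(u, v)) - (u * u - v * v)) for u, v in points)
```

Agreement at a handful of points is of course not a proof, but two quadratic polynomials in $(u, v)$ that differ would disagree at almost every point.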


Example 8.2.2 Consider the conic given by the equation

\[
2x^2 - 72xy + 23y^2 + 140x - 20y + 50 = 0.
\]

The matrix A is

\[
A = \begin{pmatrix} 2 & -36 \\ -36 & 23 \end{pmatrix}.
\]

The characteristic equation is given by

\[
\det\begin{pmatrix} 2 - \lambda & -36 \\ -36 & 23 - \lambda \end{pmatrix} = 0.
\]

Thus the eigenvalues are the roots of the equation $\lambda^2 - 25\lambda - 1250 = 0$. The eigenvalues are 50 and $-25$. We wish to find the corresponding eigenvectors. We thus have to solve the system $Ax = \lambda x$. That is, for $\lambda = 50$, solve

\[
\begin{pmatrix} 2 & -36 \\ -36 & 23 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = 50\begin{pmatrix} x \\ y \end{pmatrix}.
\]

This results in the system of equations

\begin{align*}
2x - 36y &= 50x \\
-36x + 23y &= 50y.
\end{align*}

The first equation becomes $-48x - 36y = 0$ so that the column vector $(3, -4)^t$ is an eigenvector. The vector perpendicular to this with the "correct orientation" is $(4, 3)^t$. Thus the orthogonal matrix which diagonalizes A is

\[
\begin{pmatrix} 3/5 & 4/5 \\ -4/5 & 3/5 \end{pmatrix}.
\]

The given equation becomes

\[
50u^2 - 25v^2 + 140x - 20y + 50 = 0.
\]

The coordinates are related by $x = (3u + 4v)/5$ and $y = (-4u + 3v)/5$. Using this substitution, we get $50u^2 - 25v^2 + 100u + 100v + 50 = 0$. We now complete the squares:

\[
50(u + 1)^2 - 50 - 25(v - 2)^2 + 100 + 50 = 0.
\]

We now effect a translation $X = u + 1$ and $Y = v - 2$ to get the equation in the form $50X^2 - 25Y^2 + 100 = 0$. This can be cast in the standard form $X^2/2 - Y^2/4 = -1$. Note that the standard coordinates and the last coordinates are related by

\[
(x, y) = \Bigl(\tfrac{3}{5}X + \tfrac{4}{5}Y + 1,\ -\tfrac{4}{5}X + \tfrac{3}{5}Y + 2\Bigr).
\]

We can use this information to draw the conic.
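The whole reduction in this example can be verified with exact rational arithmetic. The sketch below reads the conic as $2x^2 - 72xy + 23y^2 + 140x - 20y + 50 = 0$ (the $xy$ coefficient is inferred from the matrix A) and uses the relation between $(x, y)$ and $(X, Y)$ obtained by composing the rotation with the translation $u = X - 1$, $v = Y + 2$; the function names are hypothetical.

```python
from fractions import Fraction

def f(x, y):
    """Left-hand side of the given conic."""
    return 2*x*x - 72*x*y + 23*y*y + 140*x - 20*y + 50

def to_standard(X, Y):
    """Standard coordinates (x, y) of the point with final coordinates (X, Y)."""
    x = Fraction(3, 5) * X + Fraction(4, 5) * Y + 1
    y = Fraction(-4, 5) * X + Fraction(3, 5) * Y + 2
    return x, y

def reduced(X, Y):
    """The reduced form 50 X^2 - 25 Y^2 + 100."""
    return 50*X*X - 25*Y*Y + 100

samples = [(Fraction(a), Fraction(b)) for a in range(-3, 4) for b in range(-3, 4)]
reduced_form_matches = all(f(*to_standard(X, Y)) == reduced(X, Y) for X, Y in samples)

# Eigenvector checks for A = [[2, -36], [-36, 23]].
A = [[2, -36], [-36, 23]]
def apply(M, v):
    return [sum(m * c for m, c in zip(row, v)) for row in M]
eig_50_ok = apply(A, [3, -4]) == [50 * 3, 50 * -4]
eig_m25_ok = apply(A, [4, 3]) == [-25 * 4, -25 * 3]
```

Since both sides are quadratic polynomials in $(X, Y)$, agreement on a $7 \times 7$ grid of points already forces them to be identical.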


Example 8.2.3 Consider the quadric in $\mathbb{R}^3$ given by

\[
f(x, y, z) := 4xz + 4y^2 + 8y + 8 = 0.
\]

The matrix of coefficients of the second degree terms is

\[
A = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 4 & 0 \\ 2 & 0 & 0 \end{pmatrix}.
\]

A (geometric) look at the matrix shows that $e_2$ is an eigenvector with eigenvalue 4. (For, observe that the second column is $Ae_2$.) Also, the plane perpendicular to the y-axis, namely, the xz-plane, is mapped by this operator A to itself. Another look shows that $e_1 + e_3$ is an eigenvector with eigenvalue 2 and $e_1 - e_3$ is an eigenvector with eigenvalue $-2$. Thus as an orthonormal basis consisting of eigenvectors of A, we take

\[
\{(e_1 + e_3)/\sqrt{2},\ e_2,\ (e_3 - e_1)/\sqrt{2}\}.
\]

The orthogonal matrix which takes the standard basis to this eigenbasis is, of course, given by

\[
\begin{pmatrix} 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ 0 & 1 & 0 \\ 1/\sqrt{2} & 0 & 1/\sqrt{2} \end{pmatrix}.
\]

Now let $x_1, y_1, z_1$ be the new coordinates associated with this eigenbasis. The old and new coordinates are related by $x = (1/\sqrt{2})(x_1 - z_1)$, $y = y_1$ and $z = (1/\sqrt{2})(x_1 + z_1)$ (see Section 5.8). Hence the given polynomial becomes in the new coordinates

\[
2x_1^2 + 4y_1^2 - 2z_1^2 + 8y_1 + 8 = 0.
\]

We now complete the square in the $y_1$ variable to get

\[
2x_1^2 + 4(y_1 + 1)^2 - 4 - 2z_1^2 + 8 = 0.
\]

We now effect a change of coordinates $x_2 = x_1$, $y_2 = y_1 + 1$ and $z_2 = z_1$ so that the given polynomial becomes

\[
2x_2^2 + 4y_2^2 - 2z_2^2 + 4 = 0.
\]

Thus the given quadric is a hyperboloid of two sheets, as can be seen from the table.
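As a sanity check on this example, one can compose the rotation $x = (x_1 - z_1)/\sqrt{2}$, $y = y_1$, $z = (x_1 + z_1)/\sqrt{2}$ with the translation $y_2 = y_1 + 1$ and compare $f$ with the reduced polynomial $2x_2^2 + 4y_2^2 - 2z_2^2 + 4$. The sketch below does this in floating point at a few arbitrarily chosen points; the function names are hypothetical.

```python
import math

def f(x, y, z):
    """The given quadric polynomial 4xz + 4y^2 + 8y + 8."""
    return 4*x*z + 4*y*y + 8*y + 8

def to_standard(x2, y2, z2):
    """Undo the translation y2 = y1 + 1, then pass from eigenbasis
    coordinates (x1, y1, z1) back to the standard coordinates (x, y, z)."""
    x1, y1, z1 = x2, y2 - 1.0, z2
    s = math.sqrt(2.0)
    return (x1 - z1) / s, y1, (x1 + z1) / s

def reduced(x2, y2, z2):
    """The reduced polynomial 2 x2^2 + 4 y2^2 - 2 z2^2 + 4."""
    return 2*x2*x2 + 4*y2*y2 - 2*z2*z2 + 4

points = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-1.5, 0.5, 2.0), (2.0, -1.0, 0.25)]
max_error = max(abs(f(*to_standard(*p)) - reduced(*p)) for p in points)
```

Dividing the reduced equation by 4 gives $x_2^2/2 + y_2^2 - z_2^2/2 = -1$, which is form 4 of the table.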


Example 8.2.4 Consider the quadric surface defined by

\[
3x^2 + 2xy + 4yz + 2xz - 2x - 14y - 2z - 9 = 0.
\]

Then

\[
A = \begin{pmatrix} 3 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 0 \end{pmatrix} \qquad\text{and}\qquad b := \begin{pmatrix} 1 \\ 7 \\ 1 \end{pmatrix}.
\]

This time we look for a centre of the quadric. Thus we wish to find solutions of $Ax = b$. Solving the system of equations we get $(-3/2, 5/4, 17/4)$ as the centre of the quadric. The eigenvalues are easily found to be 1, 4 and $-2$. Using the translation $x = x' - 3/2$, $y = y' + 5/4$ and $z = z' + 17/4$, we eliminate the first degree terms without affecting the second degree terms. Thus to find the transformed equation we need only compute the constant term with respect to the above substitution:

\[
3\bigl(-\tfrac{3}{2}\bigr)^2 + 2\bigl(-\tfrac{3}{2}\bigr)\bigl(\tfrac{5}{4}\bigr) + 2\bigl(-\tfrac{3}{2}\bigr)\bigl(\tfrac{17}{4}\bigr) + 4\bigl(\tfrac{5}{4}\bigr)\bigl(\tfrac{17}{4}\bigr) - 2\bigl(-\tfrac{3}{2}\bigr) - 14\bigl(\tfrac{5}{4}\bigr) - 2\bigl(\tfrac{17}{4}\bigr) - 9 = -\tfrac{41}{2}.
\]

Thus the standard form of the given quadric is $x''^2 + 4y''^2 - 2z''^2 - \tfrac{41}{2} = 0$.
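The centre and the constant term in this example can be recomputed exactly. The sketch below assumes the quadric reads $3x^2 + 2xy + 4yz + 2xz - 2x - 14y - 2z - 9 = 0$ and that $b = (1, 7, 1)^t$, a reading chosen because it is consistent with the matrix A and with the stated centre; where the printed text is unclear, these are assumptions.

```python
from fractions import Fraction

A = [[3, 1, 1], [1, 0, 2], [1, 2, 0]]
b = [1, 7, 1]                      # assumed right-hand side of A x = b
centre = [Fraction(-3, 2), Fraction(5, 4), Fraction(17, 4)]

def f(x, y, z):
    """Assumed reading of the quadric polynomial."""
    return 3*x*x + 2*x*y + 4*y*z + 2*x*z - 2*x - 14*y - 2*z - 9

# A applied to the stated centre; it should reproduce b.
A_centre = [sum(a * c for a, c in zip(row, centre)) for row in A]

# After translating the origin to the centre the linear terms vanish and
# the new constant term is the value of f at the centre.
constant_term = f(*centre)
```

Under these assumptions the computed constant is $-41/2$, and the trace and determinant of A (3 and $-8$) agree with the eigenvalues 1, 4 and $-2$.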
Exercise 8.2.1 Reduce the following into standard forms:
(1) $11x^2 + 6xy + 19y^2 - 80$.
(2) $2x^2 - 5y^2 + 3x + 10y$.
(3) $16x^2 - 24xy + 9y^2 - 30x - 40y$.
(4) $8x^2 - 12xy + 17y^2 - 80$.
(5) $3x^2 + 2xy + 3y^2 - 4$.
(6) $5x^2 - 8xy + 5y^2 - 9$.
(7) $2x^2 + 3xy - 2y^2 - 10$.
(8) $2x^2 + 4xy + 2y^2 - 64$.
Exercise 8.2.2 Reduce the following into standard forms:
(1) dz + 4y? + By +8, 
(2) 92? - dry +6y? +32? + or 4 AVby + 122 +16. 
(3) 2? + y? 72? — dry - faz Ayz + By + 142-6, 
(4) 2? + dy? + 42? + doy fry - Byz + r+ By +7 
(5) 82? ~ zy + 4rz- dy + Dn + dy - 62 ~ 29, 
(6) 2? + 6z2y - 2y? ~ 3x24 42 
(7) -22? - My? ~ 52? + dry 4 l6yz + 2022 
(8) 3? — y? - 32? 4.9/2 - day ~ lOyz, 


Figure 8.1.1 Hyperboloid of one sheet.

Figure 8.1.2 Ellipsoid.

Figure 8.1.3 Hyperboloid of two sheets.

Figure 8.1.5 Hyperbolic paraboloid.

Figure 8.1.6 Elliptic cone.

Figure 8.1.7 Elliptic cylinder.

Figure 8.1.8 Hyperbolic cylinder.

Figure 8.1.9 Parabolic cylinder.



9. Review Problems 


In this chapter, we give lots of problems for practice. Some of the problems below appeared either as lemmas/theorems or were listed as exercises earlier. The point of giving the collection here is to help the reader to assess his overall understanding of linear algebra.

Unless specified otherwise, V stands for a finite dimensional vector space over $\mathbb{R}$. $\mathbb{R}^n$ is always equipped with the dot product or the Euclidean inner product $\langle x, y\rangle := \sum_{i=1}^n x_iy_i$. Here $x = (x_1, \dots, x_n) = \sum_i x_ie_i$ where $e_i$ are the standard basis vectors.

If a problem is just a statement, you are asked to provide a proof for it.


9.1 Linear Equations 


Exercise 9.1.1 Find a necessary and sufficient condition for either the sum of two solutions or the scalar multiple by a number $a$ ($a \ne 1$) to be a solution again of the same system of equations.

Exercise 9.1.2 Under what conditions will a given linear combination of any solutions of a given non-homogeneous system of linear equations be again a solution of the same system?

Exercise 9.1.3 Consider all possible cases encountered in solving systems of linear equations involving two or three unknowns. Give the geometric interpretation in each case.


9.2 Linear Dependence 


Exercise 9.2.1 Prove that a set of vectors containing the null vector is 
linearly dependent. 


Exercise 9.2.2 Prove that a set of vectors, two of whose vectors differ 
only by a scalar multiple is linearly dependent. 


Exercise 9.2.3 Prove that if, in a set of vectors, some subset is linearly dependent, then the full set is linearly dependent.

Exercise 9.2.4 Prove that any subset of a linearly independent set is linearly independent.

Exercise 9.2.5 Suppose $\{x_i\}_{i=1}^n$ is linearly independent, but $\{y\} \cup \{x_i\}$ is not. Then y can be written uniquely as a linear combination of the $x_i$.

Exercise 9.2.6 Is there a converse of Exercise 9.2.5?

Exercise 9.2.7 Let $a, b, c$ be distinct real numbers. Is the following set of polynomials linearly independent?

\[
\{(X - a)(X - b),\ (X - b)(X - c),\ (X - c)(X - a)\}.
\]

Exercise 9.2.8 Prove that in $P_n$, any finite set consisting of polynomials of different degrees, not containing the zero polynomial, is linearly independent.

Exercise 9.2.9 Determine whether the following sets are linearly dependent.
(1) $\{x_1 = (-3, 1, 5),\ x_2 = (6, -2, 15)\}$.
(2) $\{x_1 = (-1, 2, 3),\ x_2 = (2, 5, 7),\ x_3 = (8, 7, 10 + \varepsilon)\}$, $\varepsilon \ne 0$.
(3) $\{x_1 = (4, -12, 28),\ x_2 = (-7, 21, -49)\}$.


9.3 Basis and Dimension 


Exercise 9.3.1 Prove that 


(1) Any nonzero vector can be enlarged into a basis. 


(2) Any linearly independent set can be enlarged to a basis of the vector space.

Exercise 9.3.2 Find two different bases of $\mathbb{R}^3$ having $e_1 = (1, 0, 0)$ and $e_2 = (0, 1, 0)$ in common.

Exercise 9.3.3 Prove that in the space $P_n$, any set of nonzero polynomials containing one polynomial of each degree $k$, $k = 0, 1, \dots, n$, is a basis.


Exercise 9.3.4 Show that any basis is a maximal linearly independent set 
and a minimal set of generators. 


Exercise 9.3.5 Find the coordinates of the polynomial $t^5 - t^4 + t^3 - t^2 - t + 1$ in each of the following bases of $P_5$.

(1) $\{1, t, t^2, t^3, t^4, t^5\}$.
(2) $\{1, t + 1, t^2 + 1, t^3 + 1, t^4 + 1, t^5 + 1\}$.
(3) $\{1 + t^3, t + t^3, t^2 + t^3, t^3, t^4 + t^3, t^5 + t^3\}$.

Exercise 9.3.6 Prove that the span of an arbitrary subset of a vector space 
V is a vector subspace. 


Exercise 9.3.7 Let $W \subset V$ be a subspace. Show that $\dim W \le \dim V$. When does equality hold?

Exercise 9.3.8 Prove that in an n-dimensional vector space V, a vector subspace W of dimension k can be found for any $k = 1, 2, \dots, n$.

Exercise 9.3.9 Construct a basis of $P_5$ consisting of polynomials of degree 5. Can you construct a basis in which the degree of its members is $\le 4$?

Exercise 9.3.10 Find a basis and the dimension of the linear subspace of $\mathbb{R}^n$ given by $x_1 + x_2 + \cdots + x_n = 0$.

Exercise 9.3.11 In $P_n$, each of $W_1 = \{f(0) = 0\}$, $W_2 = \{f(1) = 0\}$, $W_3 = \{f(a) = 0\}$, $W_4 = \{f(0) = f(1) = 0\}$ is a vector subspace. Find their dimensions.

Exercise 9.3.12 Find the coordinates of the polynomial $f(X) = \sum_{i=0}^n a_iX^i$ with respect to the bases:

(1) The basis $\{1, X, X^2, \dots, X^n\}$.
(2) The basis $\{1, (X - a), (X - a)^2, \dots, (X - a)^n\}$.

Exercise 9.3.13 Prove that if the sum of the dimensions of two vector subspaces of an n-dimensional vector space exceeds n, then the subspaces have a nonzero vector in common.

In $\mathbb{R}^3$, is it possible to have two subspaces $W_1$ and $W_2$ such that $\dim W_1 = \dim W_2 = 2$, $W_1 \cap W_2 = \{0\}$? Give the geometric meaning of the above. Can you generalize this?

Exercise 9.3.14 Prove that each of the following sets of vectors in $\mathbb{R}^n$ forms a linear subspace and find a basis and the dimension of each.

(1) All n-vectors whose first and the last coordinates are equal.
(2) All n-vectors whose even entries are zero.
(3) All n-vectors whose even entries are equal.

Exercise 9.3.15 Prove that

\[
\dim W_1 + \dim W_2 = \dim(W_1 + W_2) + \dim(W_1 \cap W_2).
\]



9.4 Linear Transformations 


Exercise 9.4.1 Find all linear transformations on a vector space having 
dimension 1. 


Exercise 9.4.2 Prove that any linear transformation maps a linearly 
dependent set to a linearly dependent set. 


Exercise 9.4.3 Is it true that any linearly independent set is mapped to 
another linearly independent set under a linear transformation? 


Exercise 9.4.4 If $W \subset V$, show that $T(W)$ is a subspace and also that $\dim T(W) \le \dim W$.

Exercise 9.4.5 Show that a linear transformation is determined once we know its effect on a basis.

Exercise 9.4.6 Let $\{e_i\}_{i=1}^n$ be a basis of V. Also let $\{y_i\}_{i=1}^n \subset W$, another vector space. Show that there exists a unique linear transformation T such that $Te_i = y_i$.

Exercise 9.4.7 Let $\{x_1, \dots, x_n\}$ and $\{y_1, \dots, y_n\}$ be arbitrary subsets of V. Does there exist a unique linear transformation T such that $Tx_i = y_i$?

Exercise 9.4.8 Let W be a subspace of V and $T\colon W \to X$ be a linear transformation. Show that there exists a linear transformation $\tilde{T}\colon V \to X$ such that $\tilde{T}(w) = T(w)$ for all $w \in W$. Is this $\tilde{T}$ unique?


Exercise 9.4.9 Show that the kernel and image of a linear transformation 
are linear subspaces. 


Exercise 9.4.10 If W is a subspace of V, is there a linear transformation $T\colon V \to Y$ (Y is given) such that $\ker T = W$? (Answer depends on whether $\dim Y \ge \dim V$ or not!)


Exercise 9.4.11 Find two different linear transformations having the same 
kernel and image. 


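Exercise 9.4.12 Show that the multiplication of $2 \times 2$ matrices by $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ on the left (right) is a linear transformation. Find its matrix with respect to the basis

\[
\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\ \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},\ \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}.
\]
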

Exercise 9.4.13 Find the matrix of $\frac{d}{dx}$ on $P_n$ with respect to the basis

(1) $\{1, X, X^2, \dots, X^n\}$.
(2) $\bigl\{1, (X - a), \frac{(X - a)^2}{2!}, \dots, \frac{(X - a)^n}{n!}\bigr\}$.

Exercise 9.4.14 What change will the matrix of a linear transformation undergo if two vectors $\{e_i, e_j\}$ of the basis $\{e_1, \dots, e_n\}$ are interchanged?

Exercise 9.4.15 Prove that the matrices of the same linear transformation with respect to two different bases coincide if and only if the transition matrix between the bases commutes with the matrix of the linear transformation with respect to one of the bases.

Exercise 9.4.16 Find those subspaces of $P_n$ which remain invariant under $\frac{d}{dx}$.


Exercise 9.4.17 Find the kernel of the following linear transformations: 


(1) T: R? - R? given by 


z a zt+y 
y ae) 
(2) T: BR’ — R? given by 
r 
2 
y}r> 
. y 
(3) T: R4 + R? given by 
Z 
Vie [tty é 
z z+w 
w 


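Exercise 9.4.18 Let $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ be given by

\[
T\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 2 \\ 2 & 1 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}.
\]

Show that $Tv = w$ has a solution only if the coordinates $(x, y, z)$ of $w$ satisfy $z - x - y = 0$. Is T onto? Find a basis for the range. What is the kernel of T?
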


Exercise 9.4.19 La T: B’ + B* be given by 


ind the kernel of T. Is T one-one? Is T onto? 


Exercise 9.4.20 Find the kernel and range of T, and their bases ang 
ii i ost T is given by 


Zz 42 2 z 
: i? 23 -1 y 
2 -i 1 -2 2 


Is T one-one? 


Exercise 9.4.21 If T is given by 


find a basis for the kernel and range of T. Verify the dimension formula. 


Exercise 9.4.22 True or false: If $\{T(x_i)\}_{i=1}^n$ is linearly independent then $\{x_1, \dots, x_n\}$ is so.


Exercise 9.4.23 Prove that the linear map T: V — W is one-one if and 
only if dim(Im T) = dimV. 


Exercise 9.4.24 Construct a linear transformation T : R? 3 R° such that 
Im (T) = {r+ y+2=0}. 


Exercise 9.4.25 Construct a linear transformation T : R? 3 R* such that 
Im (T) = {Xie, 7 = 0}. 


Leiz=) 


Exercise 9.4.26 Can you formulate an exercise of which Exercises 9.4.24 
and 9.4.25 are special cases? 


Exercise 9.4.27 Find the linear transformations F ; R = R such that 


(1) F(-1) =2. 



(4) F(3) = j, 
(3) F(Q) = ~2. 


/ 


(4) F(1)= 2 and F(2) =, 


Exercise 9.4.28 Find the linear transformations $F\colon \mathbb{R}^2 \to \mathbb{R}^2$ such that


(1) F(-1,1) = (0,1). 
(2) F(1,1) = (2,0), F(—1,1) = (9,1), 


(3) F(A) = F(B) where A={(z,y) | y=22}, B={(z,y) | 7 = 0), 
Write the matrix relative to the standard basis in each case. 


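Exercise 9.4.29 Define

\[
F\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ y \end{pmatrix}.
\]

What is the matrix of F? Let S be the square with vertices (0, 0), (0, 1), (1, 0), and (1, 1). What is $F(S)$?

Exercise 9.4.30 Let $\ell_1 = \mathbb{R}(0, 0, 1)^t$ and $\ell_2 = \mathbb{R}(2, 1, 0)^t$. Find a map $F\colon \mathbb{R}^3 \to \mathbb{R}^3$ such that $F(\ell_1) = F(\ell_2)$.
Also, find F such that $F(0, 0, 1)^t = (2, 1, 0)^t$.

Exercise 9.4.31 Find $F\colon \mathbb{R}^3 \to \mathbb{R}^3$ such that $\{z = 0\}$ goes to $\{y = 0\}$.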


Exercise 9.4.32 Let P; be the plane z+ y — z = 0 and P2 be the plane 
2y+2z—2=0. Find F: R°  R° such that 


(1) F(P,) = Po. 
(2) F(P)) = P2 and F(P2) = Fis 


Exercise 9.4.33 Let $\{x = 0\}$ be the yz-coordinate plane. Find a linear transformation $F\colon \mathbb{R}^3 \to \mathbb{R}^3$ which maps it to a parallel plane $\{x = 1\}$.


9.5 Euclidean Spaces 


Exercise 9.5.1 Prove that in a Euclidean space the zero vector is the only one which is orthogonal to all vectors. Show that if $\langle a, x\rangle = \langle b, x\rangle$ for all $x \in V$, then $a = b$.

Exercise 9.5.2 If $\{x_1, \dots, x_n\}$ is an orthogonal set, then $\{a_1x_1, \dots, a_nx_n\}$ is an orthogonal set for all $a_i \in \mathbb{R}$.

Exercise 9.5.3 If $x \perp y_i$, $1 \le i \le n$, then x is perpendicular to any linear combination of the $y_i$.


Exercise 9.5.4 Prove that an orthogonal set of nonzero vectors is linearly 
independent. 


Exercise 9.5.5 Apply the Gram-Schmidt process to

\[
x_1 = (1, -2, 2), \quad x_2 = (-1, 0, 1), \quad x_3 = (5, -3, -7)
\]

in $\mathbb{R}^3$ with the dot product.

Exercise 9.5.6 Prove that the inner product of any two vectors x and y of a Euclidean space is expressed in terms of their coordinates with respect to a certain fixed basis $\{e_i\}$ by the formula $\langle x, y\rangle := \sum_i x_iy_i$ if and only if $\{e_i\}$ is an orthonormal basis.

Exercise 9.5.7 Find the dimension of the subspace formed by all vectors x such that $\langle x, a\rangle = 0$ for a fixed vector a.


Exercise 9.5.8 Let V have a basis {e;} over R. We can then define an 
inner product on V such that {e;} becomes an orthonormal basis with 
respect to this inner product. 


Exercise 9.5.9 Define an inner product on $P_n$ such that

\[
P_k(t) = \frac{t^k}{k!}, \qquad k = 0, 1, \dots, n,
\]

becomes orthonormal.


Exercise 9.5.10 On P” define (p,q) := hy p(t)g(t) dt. Is this an inner 
product? 


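Exercise 9.5.11 Problems on orthogonal complements:
(1) $(L^\perp)^\perp = L$.
(2) $V \subset W$ implies $V^\perp \supset W^\perp$.
(3) $(V + W)^\perp = V^\perp \cap W^\perp$.
(4) $E = V \oplus W$ implies $E = V^\perp \oplus W^\perp$.

Exercise 9.5.12 In $P_n$, define $\langle f, g\rangle = \sum_{i=0}^n a_ib_i$. Here $f(x) = \sum_i a_ix^i$ and $g(x) = \sum_i b_ix^i$. Find the orthogonal complement of the subspace of all polynomials satisfying the condition $f(1) = 0$ and do the same for the subspace of all polynomials of even degree.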


Exercise 9.5.13 Prove the cosine law for triangles given by x and y.

Exercise 9.5.14 Prove the Pythagoras theorem and its converse. That is, show that two vectors x and y are orthogonal if and only if $\|x - y\|^2 = \|x\|^2 + \|y\|^2$.

Exercise 9.5.15 Prove that $\|x\| = \|y\|$ if and only if $x + y \perp x - y$. What is the geometric meaning underlying this?

Exercise 9.5.16 Let $x \in \mathbb{R}^n$ be such that $\|x\| = 1$. Let $\cos\alpha_i := \langle x, e_i\rangle$, $\{e_i\}$ an orthonormal basis. Then $\sum_i \cos^2\alpha_i = 1$. Do you understand the meaning of this?


Lines

Let $\ell(p; d) := \{p + td \mid t \in \mathbb{R}\} = p + \mathbb{R}d$ for $p, d \in \mathbb{R}^n$ fixed and $d \ne 0$.

Exercise 9.5.17 Two lines $\ell(p; d_1)$ and $\ell(p; d_2)$ are the same if and only if $d_1 = ad_2$ for some nonzero $a \in \mathbb{R}$.

Exercise 9.5.18 Two lines $\ell(p; d)$ and $\ell(q; d)$ are equal if and only if $q \in \ell(p; d)$.

Exercise 9.5.19 Two lines $\ell(p; d_1)$ and $\ell(q; d_2)$ are said to be parallel if and only if their direction vectors are parallel (that is, $d_1 = ad_2$, $a \ne 0$). Given a line $\ell$, and $q \notin \ell$, there exists a unique line $\ell'$ with $q \in \ell'$ and $\ell' \parallel \ell$.

Exercise 9.5.20 Two distinct points p, q determine a line. In fact the line is $\{p + t(q - p) \mid t \in \mathbb{R}\}$.

Exercise 9.5.21 Two vectors are linearly dependent if and only if they lie on the same line through the origin.

Exercise 9.5.22 Given two parallel lines $\ell(p; d)$ and $\ell(q; d')$, then either $\ell(p; d) = \ell(q; d')$ or $\ell(p; d) \cap \ell(q; d') = \emptyset$.

Exercise 9.5.23 Given two lines $\ell(p; d)$ and $\ell(q; d')$ which are not parallel, prove that their intersection is either empty or consists of exactly one point.


9.6 Problems in Linear Geometry 


Exercise 9.6.1 Find the angle between the vectors (1, 1, 1) and (1, 0, 1) in $\mathbb{R}^3$. Find a vector of length 7 perpendicular to both these vectors.

Exercise 9.6.2 Find the orthogonal projection of (2, 0) in the direction of the vector (1, 1).

Exercise 9.6.3 Let $x, y \in V$.

(1) What can you say when $\|x\| + \|y\| = \|x + y\|$?
(2) Show that $\bigl|\,\|x\| - \|y\|\,\bigr| \le \|x - y\|$.
(3) Assume that $\|x\| = \|y\| \ne 0$. What can you say about $x - y$ and $x + y$?

Exercise 9.6.4 In V, we define $d(x, y) := \|x - y\|$. Show that d is a distance function, that is, a metric. Do you recognize Exercise 9.6.3 (2) now?

Exercise 9.6.5 Find the distance between (1, 2, 0) and (-1, 3, 4) in $\mathbb{R}^3$.


Exercise 9.6.6 Let the elements of $\mathbb{R}^2$ be written with respect to the standard basis $e_1 = (1, 0)$ and $e_2 = (0, 1)$. Thus any $u \in \mathbb{R}^2$ is written as $u = (x, y)$ if $u = xe_1 + ye_2$. Define a map $\langle\ ,\ \rangle$ from $\mathbb{R}^2 \times \mathbb{R}^2$ to $\mathbb{R}$ as follows:

\[
\langle u, v\rangle = (2x + y)x_1 + (x + 5y)y_1,
\]

where $u = (x, y)$ and $v = (x_1, y_1)$. Show that $\langle\ ,\ \rangle$ defines a new inner product on $\mathbb{R}^2$.

If you are curious as to how I thought of this crazy definition, the clue lies in the spectral theorem and the characterization of positive definite matrices in terms of their "principal minors." Here the matrix is $\begin{pmatrix} 2 & 1 \\ 1 & 5 \end{pmatrix}$.

The above exercise is simple but highly instructive; it needs perhaps some high school algebra!


Exercise 9.6.7 Let P,, be the space of all polynomials of degree < n. 
What is the dimension of ?,,? Define 


1 1 
tp.9) = plz)o(2)de and (p,q) := [ alz)a(e)de. 


What is the length of p(z) = z with respect to ( ,) and ()? In P2, apply the 
Gram-Schmidt process to the basis {1,X,X?} with respect to the above 
inner products. 


Exercise 9.6.8 Find the matrix of the linear map $A\colon \mathbb{R}^2 \to \mathbb{R}^2$ given by

\[
A(x, y) = (2x + 3y,\ x - y)
\]

with respect to the orthonormal basis $\{(1, 1)/\sqrt{2},\ (1, -1)/\sqrt{2}\}$.


Exercise 9.6.9 Write down all the orthogonal transformations of R?. 


Exercise 9.6.10 Let W be a vector subspace of a finite dimensional inner product space V. Show that there exists a unique vector subspace $W^\perp$ of V such that any $x \in V$ can be written uniquely as $x = y + z$ with $y \in W$ and $z \in W^\perp$, and with $\langle W, W^\perp\rangle = 0$.

Exercise 9.6.11 Let V be any vector space over $\mathbb{R}$, not necessarily with an inner product. A norm on V by definition is a function $\|\ \|\colon V \to \mathbb{R}$ satisfying the following conditions:

(1) $\|x\| \ge 0$ and $\|x\| = 0$ if and only if $x = 0$.
(2) $\|ax\| = |a|\,\|x\|$ for all $a \in \mathbb{R}$ and $x \in V$.
(3) $\|x + y\| \le \|x\| + \|y\|$, for $x, y \in V$.

Thus in an inner product space we have a naturally defined norm $\|x\| := \langle x, x\rangle^{1/2}$. However there may exist other norms on a vector space.

(a) Show that if V is a finite dimensional vector space over $\mathbb{R}$ and $\{v_i\}_{i=1}^n$ is a basis of V, then for $x = \sum_i x_iv_i$, the functions

\[
\|x\|_1 := \sum_{i=1}^n |x_i|, \qquad \|x\|_\infty := \max_i\{|x_i|\}, \qquad \|x\|_2 := \Bigl(\sum_{i=1}^n x_i^2\Bigr)^{1/2}
\]

are norms on V.

(b) Let $\|\ \|$ be a norm on V. Show that $d(x, y) := \|x - y\|$ is a metric on V (see Exercise 9.6.4).

(c) Let $B(x, r) := \{y \in V \mid \|x - y\| < r\}$. Then $B(x, r)$ is called an open ball of radius r centred at x. Let $V = \mathbb{R}^2$. Let $v_i = e_i$ be the standard basis vectors. Sketch the balls of radius 1 centred at the origin with respect to the various norms of (a).

(d) Show that there exist positive constants $C_1$ and $C_2$ such that

\[
C_1\|x\| \le \|x\|' \le C_2\|x\|,
\]

where $\|\ \|$ and $\|\ \|'$ are any two of the norms of (a) on $\mathbb{R}^n$.

(e) Show that a sequence $(x_k)$ in $\mathbb{R}^n$ converges to $x \in \mathbb{R}^n$ if and only if $x_{k,i}$ converges to $x_i$ for $1 \le i \le n$. Here $x_k := \sum_i x_{k,i}e_i$ and the convergence is with respect to the metric defined by any of the above norms.

(f) Let $M(n, \mathbb{R})$ be the set of all $n \times n$ matrices with real entries. Then $M(n, \mathbb{R})$ is a vector space of dimension $n^2$ over $\mathbb{R}$ under "natural operations". Let $\|A\| := \max\{|a_{ij}|\}$.

(i) Show that $\|\ \|$ is a norm on $M(n, \mathbb{R})$.

(ii) For any $x \in \mathbb{R}^n$, where $\mathbb{R}^n$ is endowed with the Euclidean norm, we have $\|Ax\| \le n\,\|A\|\,\|x\|$.

(iii) Can you think of more natural norms which use the fact that A is a (linear) map on $\mathbb{R}^n$ rather than being just a vector in $\mathbb{R}^{n^2}$?


Exercise 9.6.12 Recall the definition of a line in a vector space V not necessarily with an inner product. Let $d \in V$ be a nonzero vector and $p \in V$ any point. Then the line $\ell(p, d)$ through p having the direction d (or with direction d) is the set

\[
\ell(p, d) := \{x \in V \mid x = p + td \text{ for some } t \in \mathbb{R}\}.
\]

Prove the following theorems, completing them if necessary:

(1) $\ell(p, a)$ and $\ell(p, b)$ are equal if and only if their directions are parallel, that is, if and only if there exists a nonzero real $\alpha$ such that $\alpha a = b$.

(2) $\ell(p, a) = \ell(q, a)$ if and only if ... .

(3) Definition: Two lines are said to be parallel if and only if their directions are parallel. Given a line $\ell$ and a point $q \notin \ell$, then there exists a unique line $\ell'$ such that $q \in \ell'$ and $\ell$ is parallel to $\ell'$. (Euclid's parallel axiom proved!)

(4) Two distinct points p and q of V determine a line $\ell$ where $\ell = \ell(p, q - p) = \ell(q, p - q)$.

(5) $\ell(p, a)$ and $\ell(q, b)$ intersect if and only if $p - q$ lies in the span of ... .

Note that none of the above theorems needed the notion of an inner product. However we have

(6) In $\mathbb{R}^2$, a line can be described as $\{v \in \mathbb{R}^2 \mid \langle v - p, N\rangle = 0\}$ for a point p on the line. N is said to be normal to the line.

(7) Let $x(t) := p + td$ be a line $\ell$ in $\mathbb{R}^n$ and $q \notin \ell$. Show that

\[
f(t) = \|q - x(t)\|^2
\]

takes a minimum value exactly at one point $t = t_0$. Prove that $q - x(t_0)$ is perpendicular to d (draw pictures).

Can you generalize this to the case of Exercise 9.6.10?


Exercise 9.6.13 Find the parametric equations of lines through the
following pairs of points and find the midpoint of the segment between
the pairs:

(1) (−5, −6, 8) and (1, 3, 7).
(2) (2, 4, 6) and (1, 2, 3).
(3) (1, 3, 10) and (−3, 6, −2).
(4) (10, 3, 1) and (6, −2, −3).
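
A line through two points p and q can be written as x(t) = p + t(q − p),
and the midpoint is x(1/2). The following Python sketch, with helper names
that are assumptions of this example, works this out for the first pair
and can be used to check the remaining answers.

```python
def line_through(p, q):
    """Return (point, direction) for the line x(t) = p + t*(q - p)."""
    direction = tuple(qi - pi for pi, qi in zip(p, q))
    return p, direction

def midpoint(p, q):
    """Midpoint of the segment joining p and q, i.e. x(1/2)."""
    return tuple((pi + qi) / 2 for pi, qi in zip(p, q))

p, q = (-5, -6, 8), (1, 3, 7)
point, direction = line_through(p, q)

print(point, direction)   # (-5, -6, 8) (6, 9, -1)
print(midpoint(p, q))     # (-2.0, -1.5, 7.5)
```

So the first line has parametric equations x = −5 + 6t, y = −6 + 9t,
z = 8 − t.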


We recall the definition of a plane in R³. Given a point P and a nonzero
vector N, the plane Π(P, N) is the set of all lines through the point P
which are perpendicular to N. Thus,

Π(P, N) := { X ∈ R³ | (X − P)·N = 0 } = { X ∈ R³ | X·N = P·N }.

(Note the similarity between this and (6) of Exercise 9.6.12.) Since P and N
are given, P·N is a constant, say, d. If X = (x, y, z) and N = (a, b, c), then
X·N = ax + by + cz, so that we get the equation of the plane ax + by + cz = d.


Exercise 9.6.14 For each of the following equations find the normal vector 
to the corresponding plane and find any point on the plane: 


(1) x + y + z = 1.
(2) 2x + 3y − z = 2.
(3) (x − 2) + 3(y − 5) − 4(z + 1) = 0.


Exercise 9.6.15 Find the equation of a plane through the three given 
points: 


(1) (1, 0, 0), (0, 1, 0), (0, 0, 1).
(2) (1, 0, 0), (−1, 0, 0), (0, 1, 1).
(3) (0, 1, 0), (0, 2, 0), (0, 0, −1).
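
One possible approach to such problems, assuming the cross product is
available, is to take N = (q − p) × (r − p) as a normal vector and
d = N·p as the constant. The sketch below follows that approach; the
function names are assumptions of this example.

```python
def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_through(p, q, r):
    """Return (N, d) with the plane {X | X.N = d} passing through p, q, r."""
    n = cross(sub(q, p), sub(r, p))
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    return n, dot(n, p)

normal, d = plane_through((1, 0, 0), (0, 1, 0), (0, 0, 1))
print(normal, d)   # (1, 1, 1) 1, i.e. the plane x + y + z = 1
```

Any nonzero multiple of (N, d) describes the same plane, so an answer
obtained by hand may differ from the output by a scalar factor.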




Exercise 9.6.16 Show that the x-axis in R³ is a line.


Exercise 9.6.17 Find the point of intersection of the lines
(1, −5, 2) + t(−3, 4, 0) and (3, −13, 1) + t(4, 0, 1).

Exercise 9.6.18 Prove that the line (1, 3, −1) + t(0, 3, 5) lies on the plane
2x − 5y + 3z = −16.

Exercise 9.6.19 Let p₁ and p₂ lie on the plane { p ∈ R³ | ⟨p, N⟩ = d }.
Prove that any point of the line joining p₁ and p₂ lies in the plane.


Exercise 9.6.20 Find all points of intersection of the given line and the 
plane: 


(1) t(1, −3, 6); x + 3y + z = 2.
(2) (1, −3, 6) + t(1, 0, 0); z = 6.
(3) (1, −3, 6) + t(1, 0, 0); z = 0.

Exercise 9.6.21 Prove the following: 


(1) If B·N ≠ 0, the line A + tB intersects the plane P·N = P₀·N
at exactly one point.

(2) If B·N = 0 and A is in the plane P·N = P₀·N, then the entire line
A + tB lies in the plane P·N = P₀·N.

(3) If B·N = 0 and A is not in the plane P·N = P₀·N, then the line
A + tB does not intersect the plane P·N = P₀·N at any point.
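
The three cases can be illustrated by substituting A + tB into the
equation of the plane, which gives t (B·N) = d − A·N with d := P₀·N. The
Python sketch below classifies the intersection accordingly. The function
name, the tolerance and the sample data are assumptions made for this
example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def intersect_line_plane(a, b, n, d, tol=1e-12):
    """Classify the intersection of the line A + tB with the plane X.N = d.

    Returns ("point", X) when B.N != 0, ("line", None) when the whole line
    lies in the plane, and ("empty", None) when they do not meet.
    """
    bn = dot(b, n)
    gap = d - dot(a, n)
    if abs(bn) > tol:
        t = gap / bn
        return "point", tuple(ai + t * bi for ai, bi in zip(a, b))
    if abs(gap) <= tol:
        return "line", None
    return "empty", None

# Sample data chosen for illustration.
print(intersect_line_plane((0, 0, 0), (1, -3, 6), (1, 3, 1), 2))
# ('point', (-1.0, 3.0, -6.0))
print(intersect_line_plane((1, -3, 6), (1, 0, 0), (0, 0, 1), 6))
# ('line', None)
print(intersect_line_plane((1, -3, 6), (1, 0, 0), (0, 0, 1), 0))
# ('empty', None)
```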


Exercise 9.6.22 (1) Find the line through the given point (x₁, y₁, z₁)
(say P₁) and normal to the plane Π := { (x, y, z) | ax + by + cz = d }.

(2) Find the point P₀ in which the line and the plane in (1) intersect.

(3) For any point P on the plane and P₀ as in (2), show that
P₁ − P₀ ⊥ P − P₀.

(4) The distance from P₁ to the given plane is given by

d(P₁, Π) = |ax₁ + by₁ + cz₁ − d| / √(a² + b² + c²).

Compare this with (7) of Exercise 9.6.12. Can you generalize this to
a hyperplane Π := x + W, where W is an (n − 1)-dimensional vector
subspace of an n-dimensional inner product space?
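
As a numerical check of (2) and (4), write N = (a, b, c). The normal line
through P₁ is P₁ + tN, and it meets the plane where
t = (d − P₁·N)/(N·N). The sketch below computes the foot P₀ and compares
the distance formula with ‖P₁ − P₀‖. The function name and the sample
plane are assumptions of this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def foot_and_distance(p1, n, d):
    """Foot P0 of the perpendicular from P1 to {X | X.N = d}, and d(P1, plane)."""
    t = (d - dot(p1, n)) / dot(n, n)
    p0 = tuple(pi + t * ni for pi, ni in zip(p1, n))
    dist = abs(dot(p1, n) - d) / math.sqrt(dot(n, n))
    return p0, dist

# Hypothetical sample: the point (1, 2, 3) and the plane 2x - 2y + z = 4.
p1, n, d = (1.0, 2.0, 3.0), (2.0, -2.0, 1.0), 4.0
p0, dist = foot_and_distance(p1, n, d)

print(dist)                # distance given by the formula in (4)
print(math.dist(p1, p0))   # length of P1 - P0; agrees up to rounding
```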




Exercise 9.6.23 (1) Let P₀ lie on the sphere { x | ‖x‖ = r }. Prove
that the line P₀ + tB intersects the sphere in two distinct points
unless B·P₀ = 0.

(2) The plane P·P₀ = P₀·P₀ intersects the sphere only at P₀. Hint:
P = P₀ + (P − P₀).

(3) Let P·N = P₀·N be a plane through P₀. Prove that there is a point
P₁ which lies on the sphere, on this plane and on the line −P₀ + tN.
Show also that P₁ = P₀ if and only if N is parallel to P₀.


Exercise 9.6.24 Let A = (1, 8, 2) and B = (−3, 1, 1). Show that (x, y, z)
lies on the line A + tB if and only if x = 7 − 3z and y = 6 + z. What is the
geometric interpretation of this result?


Exercise 9.6.25 Prove that every line is the intersection of two planes. 


Exercise 9.6.26 Show that all three medians of a triangle meet at a point
which divides the median “in the ratio 1:2”.


Exercise 9.6.27 Let E = R² be endowed with the dot product. Let

A = ( a  0 )
    ( 0  c ).

Then the locus of the points { x | ⟨Ax, x⟩ = 1 } can be considered as a conic
section. It is an ellipse if a > 0 and c > 0 and it is a hyperbola if a > 0
and c < 0, for example. In the case of an ellipse, the eigenvalues of A are
obtained from the minor and major axes of the ellipse. This may help you
understand the proof of Theorem 7.3.8.


9.7 Miscellaneous Problems 
Exercise 9.7.1 Answer true or false: 


(1) If the vector 0 is among {vᵢ | 1 ≤ i ≤ k}, then {vᵢ | 1 ≤ i ≤ k} is
linearly dependent.

(2) If v₁, v₂, v₃ ∈ R³ are linearly dependent, then some vᵢ is a scalar
multiple of some vⱼ (i ≠ j).

(3) If {v₁, …, vₙ} ⊂ V is linearly dependent and i is given, 1 ≤ i ≤ n,
then vᵢ is a linear combination of the other vⱼ’s, j ≠ i.


(4) The set of vectors { (x, y) ∈ R² | x² − y² = 0 } is a vector subspace.


(5) If Sᵢ, i = 1, 2, are subsets of V, and L(S₁) = L(S₂), then S₁ ∩ S₂ ≠ ∅.




(6) The set of solutions of a₁X₁ + ··· + aₙXₙ = 5 in Rⁿ is a vector
subspace for any a₁, …, aₙ ∈ R.

Exercise 9.7.2 What are the vector subspaces of R? 


Exercise 9.7.3 If W₁ and W₂ are vector subspaces of V, are W₁ ∩ W₂ and
W₁ ∪ W₂ vector subspaces of V?


Exercise 9.7.4 Let W ⊂ V and x ∈ V \ W. Can you find a vector subspace
W₁ such that W₁ ⊃ W and x ∈ W₁?


Exercise 9.7.5 Consider Pₙ. Let α be a real number. Show that

W := { f ∈ Pₙ | f(α) = 0 }

is a vector subspace.


Exercise 9.7.6 Show that W := { f ∈ C[0,1] | f′(1/2) exists } is a vector
space.


Exercise 9.7.7 Let W₁ and W₂ be vector subspaces of V. Show that
W₁ + W₂ is a vector subspace of V.


Exercise 9.7.8 Find the vector subspaces of R².


Exercise 9.7.9 Let S = { (x, y) ∈ R² | x + y = 1 }. What is the
subspace L(S) spanned by S?


Exercise 9.7.10 Prove that if T: V → W is a linear map and
dim ker T = dim V,
then T = 0.


Exercise 9.7.11 If T, S: V → V are linear maps, then show that
ker S ⊂ ker TS.


Exercise 9.7.12 If T: V → W is a linear map with ker T ≠ {0}, then
there are distinct vectors v₁ and v₂ in V such that Tv₁ = Tv₂.


Exercise 9.7.13 Show that if T ∈ L(V,W) with dim V > dim W, then
there is a nonzero vector v ∈ V with Tv = 0.


Exercise 9.7.14 Let T: R³ → R³ be a linear map with
Te₁ = e₂, Te₂ = e₃ and Te₃ = 0.
Then T ≠ 0, T² ≠ 0 but T³ = 0.
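
Reading the exercise as Te₁ = e₂, Te₂ = e₃ and Te₃ = 0, the matrix of T
in the standard basis has these images as its columns, and the claim can
be checked by direct multiplication. The helper `matmul` below is an
assumption of this sketch, not something defined in the text.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Columns are the images of e1, e2, e3: Te1 = e2, Te2 = e3, Te3 = 0.
T = [[0, 0, 0],
     [1, 0, 0],
     [0, 1, 0]]

T2 = matmul(T, T)
T3 = matmul(T2, T)
zero = [[0] * 3 for _ in range(3)]

print(T != zero, T2 != zero, T3 == zero)   # True True True
```

Here T² sends e₁ to e₃ and kills e₂ and e₃, which is why one more
application of T gives zero.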


Exercise 9.7.15 If V is an inner product space, and W is a vector
subspace of V, then (W⊥)⊥ = W.




Exercise 9.7.16 Let V be an n-dimensional inner product space. If v ∈ V
is perpendicular to n linearly independent vectors, then v = 0.


Exercise 9.7.17 Let vᵢ, 1 ≤ i ≤ n, be nonzero vectors in an inner product
space such that ⟨vᵢ, vⱼ⟩ = 0 if i ≠ j. Show that {vᵢ | 1 ≤ i ≤ n} is linearly
independent.


Exercise 9.7.18 The matrix representation of the identity transformation 
of a vector space is always the identity matrix. 


Exercise 9.7.19 Show that an n x n matrix A commutes with every 
diagonal matrix if and only if A is diagonal. 


Exercise 9.7.20 Let T: R³ → R³ be defined by
T(x, y, z) = (x + y, y + z, z + x).

Find a similar formula for T⁻¹.

Exercise 9.7.21 If T: V → V is linear and T⁻¹ exists, then T⁻¹ is linear.


Exercise 9.7.22 Let a linear map T: V → V be invertible. If v₁, …, vᵣ
are linearly independent, so are Tv₁, …, Tvᵣ.


Exercise 9.7.23 Which of the following are vector subspaces? 


(1) {(x, y, z) | x = z} ⊂ R³          (2) {(x, y, z) | x = z = 0} ⊂ R³
(3) {(x, y, z) | x > 0} ⊂ R³          (4) {(x, y, z) | x = z + y} ⊂ R³.


Exercise 9.7.24 What is the geometric description of the subspace of R³
spanned by {(1, 0, 1), (1, −1, 0)}?

Exercise 9.7.25 Find the subspace spanned by {1, x − a, (x − a)²} in P₃.


Exercise 9.7.26 Which of the following sets span R³?

(1) {(1, −1, 2), (0, 0, 1)}.
(2) {(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 2)}.
(3) {(1, 0, 0), (0, 1, 1), (0, 1, −1)}.
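
A set of vectors spans R³ exactly when the matrix having them as rows has
rank 3. The following Python sketch computes the rank by Gaussian
elimination over exact fractions; the function `rank` is a helper written
for this example, not a library call.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the matrix whose rows are the given vectors (exact arithmetic)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

sets = [
    [(1, -1, 2), (0, 0, 1)],
    [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 2)],
    [(1, 0, 0), (0, 1, 1), (0, 1, -1)],
]
print([rank(s) for s in sets])   # [2, 3, 3]
```

The same helper can be used to test whether a set is a basis of Rⁿ: it
must contain exactly n vectors and have rank n.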


Exercise 9.7.27 Let {x, y, z} be a linearly independent subset of V. Let
u = x, v = x + y and w = x + y + z. Prove that {u, v, w} is linearly
independent.


Exercise 9.7.28 Let S₁ ⊂ S₂. Then show that

(1) If S₁ is linearly independent so is S₂.

(2) If S₂ is linearly independent so is S₁.


Exercise 9.7.29 The set {(x₁, y₁), (x₂, y₂)} is a basis of R² if and only if ... .


Exercise 9.7.30 No set of n − 1 vectors can span an n-dimensional vector
space.


Exercise 9.7.31 Which of the following sets are bases of R²?

(1) {(1, 2), (1, −1)}                 (2) {(1, 0), (0, 1), (0, 0)}
(3) {(1, 1), (1, 2), (1, 0)}          (4) {(1, −1)}.


Exercise 9.7.32 Find a basis for the following subspaces of R³.

(1) {(x, y, z) | x = z + y}.
(2) {(x, y, z) | x = z}.
(3) {(x, y, z) | z = 0}.
(4) {(x, y, z) | ax + by + cz = 0, a ≠ 0}.

Exercise 9.7.33 Find an orthonormal basis of R³ containing the vectors


(1,0, 1) (1, -1,0) 
27, a and aay Ts 


Exercise 9.7.34 Apply Gram-Schmidt process to obtain an orthonormal
basis from

{(1, 0, 1), (1, −1, 0), (1, 1, 1)}.

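
The process can be sketched in Python as follows: each vector has its
components along the previously built unit vectors subtracted and is then
normalized. This is a minimal illustration in floating point, with
function names that are assumptions of this example; it is meant for
checking a hand computation, not as a numerically robust routine.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same subspace as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(v, u)                      # component of v along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("the vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

u1, u2, u3 = gram_schmidt([(1, 0, 1), (1, -1, 0), (1, 1, 1)])
for u in (u1, u2, u3):
    print([round(x, 4) for x in u])
```

Up to rounding, the output should agree with (1, 0, 1)/√2,
(1, −2, −1)/√6 and (1, 1, −1)/√3.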
Exercise 9.7.35 Which of the following are linear maps? 


x zr+y 
(1) Jylro] 2 from R? + RS, 
z 


T—-2 


2 
(2) Fie from R? > R?, 
y zt+y 




x 
(3) |y} [777] from Re 5 Re, 
yz 


Exercise 9.7.36 Let T: V → W be a linear map. If {v₁, …, vₖ} spans
V, then {Tvᵢ | 1 ≤ i ≤ k} spans Im T.


Exercise 9.7.37 If T: V → W is linear and {v₁, …, vᵣ} is such that
{Tv₁, …, Tvᵣ} is linearly independent, so is {v₁, …, vᵣ}.


Exercise 9.7.38 Let T: V = R² → R² = W be given by
(x, y) ↦ (x + y, x − y).

Let {e₁, e₂} be the standard basis of R² and {v₁ = (1, 1), v₂ = (1, −1)} be
another basis of R². Compute the matrix representation of T with respect
to:

(1) The natural basis of R².
(2) The standard basis of V and {v₁, v₂} of W.
(3) The basis {v₁, v₂} of V and {e₁, e₂} of W.
(4) The basis {v₁, v₂} of V and W.
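
Reading the map as (x, y) ↦ (x + y, x − y), each matrix can be built
column by column: the j-th column holds the coordinates of T applied to
the j-th basis vector of V, expressed in the chosen basis of W. The
sketch below does this with exact fractions and Cramer's rule; the
helpers `coords` and `matrix_of` are assumptions of this example.

```python
from fractions import Fraction

def T(v):
    x, y = v
    return (x + y, x - y)

def coords(w, basis):
    """Coordinates (s, t) of w in R^2 with w = s*b1 + t*b2 (Cramer's rule)."""
    (a, c), (b, d) = basis            # b1 = (a, c), b2 = (b, d)
    det = a * d - b * c
    s = Fraction(w[0] * d - b * w[1], det)
    t = Fraction(a * w[1] - c * w[0], det)
    return (s, t)

def matrix_of(f, domain_basis, codomain_basis):
    """Rows of the matrix of f relative to the two given bases."""
    columns = [coords(f(v), codomain_basis) for v in domain_basis]
    return [tuple(int(x) if x.denominator == 1 else x for x in row)
            for row in zip(*columns)]

std = ((1, 0), (0, 1))
vb = ((1, 1), (1, -1))

print(matrix_of(T, std, std))   # [(1, 1), (1, -1)]
print(matrix_of(T, std, vb))    # [(1, 0), (0, 1)]
print(matrix_of(T, vb, std))    # [(2, 0), (0, 2)]
print(matrix_of(T, vb, vb))     # [(1, 1), (1, -1)]
```

The second and third outputs reflect that Te₁ = v₁, Te₂ = v₂, Tv₁ = 2e₁
and Tv₂ = 2e₂.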



